How Many Trades Do You Need to Backtest? (The Sample-Size Math)

Ask ten traders how many trades you need before trusting a backtest and you’ll hear everything from “twenty is plenty” to “a thousand, minimum”. The real answer comes from arithmetic, not opinion — and the arithmetic says small samples are far noisier than most traders’ intuition allows. A strategy can look brilliant over 20 trades and be genuinely worthless.

This guide does the sample-size math in plain language, shows why expectancy converges even more slowly than win rate, and grounds it all in real numbers from our own published research — including a strategy that fired exactly one trade in twelve months and another that settled the question with more than 3,000. It ends with a practical ladder: 30 trades to triage an idea, 100 to take it seriously, 300 to trust it.

Why 20 trades tells you almost nothing

Flip a fair coin 20 times and about one time in eight you’ll see 13 or more heads, a “win rate” of 65% or better from a process with zero edge. Trading samples behave the same way: across 20 trades, ordinary luck routinely produces streaks and clusters that look exactly like skill. The numbers feel meaningful because they came from hours of effort, but they carry barely more information than a handful of coin flips.
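The coin-flip figure can be checked with the exact binomial formula rather than taken on trust. The sketch below is a minimal illustration; the function name prob_at_least is our own label, not part of any library.

```python
from math import comb

def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Exact binomial probability of k or more wins in n independent trades."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Chance that a zero-edge (50%) process shows 13 or more wins out of 20.
p_lucky = prob_at_least(13, 20)
print(f"P(13+ wins in 20 trades at a true 50%): {p_lucky:.1%}")
```

With a fair coin this comes out at a little over 13%, so a 65%+ win rate over 20 trades is something a no-edge process produces about one time in eight.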

Put precisely: a 55% win rate measured over 20 trades is statistically indistinguishable from a true 35% strategy and from a true 75% one — the 95% confidence interval at that sample size spans roughly 22 percentage points in each direction. Most traders who “validated” a setup over a couple of dozen trades have measured noise, formed a conviction, and then funded it.

The math, done simply: standard error

You don’t need a statistics degree; one formula covers it. The standard error of a measured win rate is √(p(1−p)/n), where p is the true win rate and n is the number of trades. In practice you don’t know p, so you plug in your measured win rate; the result is largest at p = 50%, which makes 50% the conservative assumption. The standard error tells you how far your measured win rate typically wanders from the truth: about 68% of the time you’ll land within one standard error of the real number, and about 95% of the time within two.

Notice the brutal exchange rate hiding in that square root: to halve your uncertainty, you must quadruple your trade count. That single fact explains why confidence is so expensive in trading — and why shortcuts are so tempting. For a strategy that truly wins 50% of the time, the standard error works out to:

n = 25 trades: √(0.25/25) = ±10 percentage points. Your measured win rate typically lands anywhere between 40% and 60%, and the 95% range is roughly 30–70%.
n = 100 trades: √(0.25/100) = ±5 points. Typically 45–55%, with a 95% range of roughly 40–60%.
n = 250 trades: √(0.25/250) ≈ ±3.2 points. Typically 46.8–53.2%, with a 95% range of roughly 44–56%.
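The three rows above can be reproduced in a few lines. This is a minimal sketch of the formula as stated, using the two-standard-error rule of thumb for the 95% range; the function name win_rate_standard_error is our own choice.

```python
from math import sqrt

def win_rate_standard_error(p: float, n: int) -> float:
    """Standard error of a win rate measured over n trades with true win rate p."""
    if n <= 0 or not 0.0 <= p <= 1.0:
        raise ValueError("need n > 0 and 0 <= p <= 1")
    return sqrt(p * (1 - p) / n)

for n in (25, 100, 250):
    se_points = 100 * win_rate_standard_error(0.5, n)
    # Two standard errors either side of 50% approximates the 95% range.
    low, high = 50 - 2 * se_points, 50 + 2 * se_points
    print(f"n={n:>3}: SE = {se_points:.1f} points, ~95% range {low:.0f}-{high:.0f}%")
```

Calling win_rate_standard_error(0.5, 400) returns half the value of win_rate_standard_error(0.5, 100), which is the quadruple-to-halve exchange rate in action.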

Expectancy converges even more slowly than win rate

Win rate is the friendly case: every trade is a simple yes/no, so the noise shrinks predictably with the formula above. Expectancy, your average profit per trade, is harsher, because trade outcomes are not uniform. The standard error of an average is the standard deviation of the individual results divided by √n, so the wider the spread of outcomes, the more trades it takes to reach the same precision. A results distribution that includes the occasional +6R or +8R winner has fat tails, and averages of fat-tailed distributions need substantially more data before they settle down.

In a 30-trade sample, a single outsized winner can flip the measured expectancy from negative to positive entirely on its own. Strip out one lucky trade and the “edge” disappears. That is why a small backtest should always be re-read with its best trade removed: if the strategy only works because of one print, you haven’t found an edge — you’ve found an anecdote.
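The best-trade-removed check is easy to automate. The sketch below uses an invented 30-trade sample in R multiples (20 losers at −1R, 9 winners at +1.5R and one +8R outlier); the numbers are purely illustrative and do not come from our research.

```python
from statistics import mean

def expectancy(results_r):
    """Average result per trade, in R multiples."""
    return mean(results_r)

def expectancy_without_best(results_r):
    """Expectancy after dropping the single largest winner."""
    if len(results_r) < 2:
        raise ValueError("need at least two trades")
    trimmed = sorted(results_r)[:-1]  # ascending sort, so the last item is the best trade
    return mean(trimmed)

# Hypothetical sample: 20 losers, 9 ordinary winners, one outsized winner.
sample = [-1.0] * 20 + [1.5] * 9 + [8.0]

print(f"All 30 trades:      {expectancy(sample):+.2f}R per trade")
print(f"Best trade removed: {expectancy_without_best(sample):+.2f}R per trade")
```

In this made-up sample the full set shows a small positive expectancy, while the same set without its single +8R trade is clearly negative.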

Trade count isn’t enough — you need regime coverage

One hundred trades from a single trending month is a sample of one market regime, not a sample of one hundred trades. Markets alternate between trending, ranging and volatile states, and most strategies are quietly specialised: breakout systems feast in trends and bleed in chop, while mean reversion does the opposite. If all your trades come from one regime, your effective sample size is closer to one.

That is why our own published backtests run across a full twelve months of data (June 2025 – June 2026) rather than a flattering window. Spread your testing across time deliberately: different months, different volatility levels, different sessions. A 100-trade sample spanning a year tells you more than a 300-trade sample squeezed from six good weeks.
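One rough way to check coverage is to bucket a trade journal by calendar month and look at how many months contribute trades and how each one performed. The sketch below assumes a hypothetical journal of (date, result in R) pairs; calendar month is only a crude stand-in for market regime, and the sample entries are invented.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def summarize_by_month(trades):
    """Group (date, result_in_R) pairs by month; return {(year, month): (count, mean_R)}."""
    buckets = defaultdict(list)
    for day, result_r in trades:
        buckets[(day.year, day.month)].append(result_r)
    return {month: (len(results), mean(results)) for month, results in sorted(buckets.items())}

# Hypothetical journal entries: (trade date, result in R multiples).
trades = [
    (date(2025, 7, 3), 1.5),
    (date(2025, 7, 18), -1.0),
    (date(2025, 7, 29), 2.0),
    (date(2025, 11, 6), -1.0),
    (date(2025, 11, 21), -1.0),
    (date(2026, 2, 10), 1.5),
]

summary = summarize_by_month(trades)
for (year, month), (count, avg_r) in summary.items():
    print(f"{year}-{month:02d}: {count} trades, average {avg_r:+.2f}R")
print(f"Months covered: {len(summary)}")
```

If nearly all the trades, or nearly all the profit, sit in one or two buckets, the sample is narrower than its trade count suggests.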

What our research shows about sample size

Sample size isn’t academic — it decided what we could and couldn’t conclude in our own research. When we ran the strict ICT Silver Bullet recipe (liquidity sweep, market-structure shift and fair value gap, all in the AM window) mechanically over twelve months of 5-minute BTC data, it fired exactly one trade. One. A strategy that generates one signal a year is untestable in any statistical sense — you cannot distinguish it from a coin you only flipped once. We published a relaxed variant (46 trades) alongside it specifically so the sample-size difference is visible.

At the other extreme, fading liquidity sweeps generated 3,349 trades on BTC alone — and another 3,282 on ETH. At that sample size the verdict was unambiguous: a 21.1% win rate and a 0.22 profit factor are not bad luck, they are the strategy. Big samples don’t guarantee a good strategy — they guarantee an honest answer.
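To see why the large sample settles the question, the same standard-error formula can be applied to the figures quoted above. This is a sketch under two stated assumptions: that the 21.1% win rate refers to the 3,349 BTC trades, and that the 46-trade variant is evaluated at the worst-case win rate of 50%, since its actual win rate is not quoted here.

```python
from math import sqrt

def margin_95(win_rate: float, n: int) -> float:
    """Approximate 95% margin of error for a win rate measured over n trades."""
    return 1.96 * sqrt(win_rate * (1 - win_rate) / n)

# Assumption: the quoted 21.1% win rate applies to the 3,349 BTC trades.
large = margin_95(0.211, 3349)
# Assumption: worst case of 50%, because the 46-trade variant's win rate is not given here.
small = margin_95(0.5, 46)

print(f"3,349 trades at 21.1%: +/- {large * 100:.1f} points")
print(f"46 trades, worst case: +/- {small * 100:.1f} points")
```

Under those assumptions the large sample pins the win rate down to within about 1.4 points, while 46 trades leave it uncertain by roughly 14 points in each direction.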

A practical ladder: 30 → 100 → 300

Here is the working ladder we recommend, with what each stage can and cannot tell you. The margins quoted are approximate 95% ranges for a win rate near 50%:

30 trades — triage. At roughly ±18 points (95% range) you cannot confirm an edge, but you can kill an obviously broken idea: terrible risk-reward, constant stop-outs, a setup that never appears cleanly. Most ideas should die here, cheaply.
100 trades — take it seriously. Uncertainty tightens to roughly ±10 points. If the strategy still looks positive across more than one market regime, it has earned forward testing on data you haven’t seen.
300 trades — trust it, provisionally. Roughly ±6 points at 95% confidence, and enough trades to slice by month, by session and by volatility regime. This is the first rung where position-sizing decisions become defensible.
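The ladder can also be read in reverse: pick the margin of error you can live with and solve the standard-error formula for n. The sketch below assumes the conservative case of a 50% win rate and a 95% confidence multiplier of 1.96; trades_needed is our own helper name.

```python
from math import ceil

Z_95 = 1.96  # normal-approximation multiplier for 95% confidence

def trades_needed(margin: float, p: float = 0.5) -> int:
    """Smallest n whose 95% margin of error on the win rate is at most `margin`."""
    if not 0.0 < margin < 1.0:
        raise ValueError("margin must be a fraction between 0 and 1")
    return ceil((Z_95 / margin) ** 2 * p * (1 - p))

for margin in (0.18, 0.10, 0.06, 0.05):
    print(f"+/- {margin * 100:.0f} points needs about {trades_needed(margin)} trades")
```

The results land close to the rungs above: about 30 trades for ±18 points, just under 100 for ±10, and somewhat under 300 for ±6. Tightening to ±5 points already requires close to 400 trades.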

How to get 300 trades without spending two years

The catch is obvious: collecting 300 live trades on a setup that fires once a day takes about ten months in a market that trades every day, and more than a year in one with roughly 250 sessions a year. Either way, you’d be funding the lesson with real money. The fix is to compress time. Bar-by-bar replay lets you trade years of real historical data in days, taking every valid instance of your setup without peeking ahead; an automated engine can sample even faster when your rules are fully mechanical.

Secuora covers both routes: a bar-replay backtester where crypto replay is free (and there’s a live demo at /backtest/demo with no sign-up), plus an AI backtester at /backtest/ai that turns a plain-English strategy description into a deterministic test. Every replay trade is journaled automatically, so your sample builds itself while you practise.

Frequently asked questions

How many trades do I need for a statistically significant backtest?

As a working ladder: 30 trades to discard weak ideas, 100 to take a strategy seriously, and 300+ before trusting it with meaningful size. Even at 100 trades your measured win rate is only accurate to roughly ±10 percentage points at 95% confidence.

Is a 20-trade backtest worth anything?

Only as a first impression. Over 20 trades, ordinary luck routinely produces win rates 10+ points away from the truth, and the 95% range is roughly ±22 points, so a 55% result is consistent with anything from a losing strategy to an excellent one. Use it to decide whether to keep testing, never to start sizing up.

Do more trades always make a backtest more reliable?

More trades shrink statistical noise, but only if they span different market conditions. Three hundred trades from one trending quarter still measure a single regime. Aim for both: a large sample and coverage across trending, ranging and volatile periods.

What if my strategy only produces a few trades per year?

Then it cannot be validated statistically: our strict Silver Bullet test produced one trade in twelve months, which proves nothing in either direction. Relax the rules to generate a testable sample, test across more instruments, or accept that you are trading on faith.