Interpreting Results
A backtest produces a wealth of data — performance metrics, equity curves, drawdown charts, and trade-by-trade breakdowns. Understanding what each metric means, what constitutes good versus bad results, and which red flags to watch for is essential to making sound decisions about your strategy.
Key Performance Metrics
Arconomy's backtest report includes the following core metrics. Each one captures a different dimension of your strategy's performance.
Net Profit
The total profit or loss generated by the strategy over the backtest period, after deducting all costs (spread, slippage, commissions). Net profit is expressed both as an absolute currency value and as a percentage return on the initial balance.
While net profit is the most intuitive metric, it should never be evaluated in isolation. A high net profit achieved through excessive risk-taking or a small number of lucky trades is not as valuable as a moderate net profit earned consistently.
Win Rate
The percentage of trades that closed in profit. Calculated as:
Win Rate = (Winning Trades / Total Trades) x 100
A common misconception is that a high win rate is necessary for profitability. Many successful strategies have win rates below 50% — they make money because their winning trades are significantly larger than their losing trades. What matters is the relationship between win rate and the average win-to-loss ratio.
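The link between win rate and win-to-loss ratio can be made concrete with a short sketch. It computes the break-even win rate, the point where average gains exactly offset average losses, for a given ratio. The function name `breakeven_win_rate` is illustrative and not part of Arconomy; costs are assumed to be already included in the average win and loss figures.

```python
def breakeven_win_rate(win_loss_ratio: float) -> float:
    """Return the win rate (as a fraction, 0-1) at which a strategy breaks even.

    win_loss_ratio is average win / average loss, both taken as positive amounts.
    Setting W * avg_win - (1 - W) * avg_loss = 0 and solving for W gives
    W = 1 / (1 + win_loss_ratio).
    """
    if win_loss_ratio <= 0:
        raise ValueError("win_loss_ratio must be positive")
    return 1.0 / (1.0 + win_loss_ratio)


# Illustrative ratios only; these are not Arconomy defaults.
for ratio in (0.5, 1.0, 2.0, 3.0):
    rate = breakeven_win_rate(ratio)
    print(f"win/loss ratio {ratio:.1f} -> break-even win rate {rate:.1%}")
```

With winners twice the size of losers, the strategy only needs to win about one trade in three to break even, which is why a win rate below 50% can still be profitable.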
Profit Factor
The ratio of gross profits to gross losses. A profit factor greater than 1.0 means the strategy is profitable overall.
Profit Factor = Gross Profit / Gross Loss
| Profit Factor | Interpretation |
|---|---|
| Below 1.0 | Losing strategy — gross losses exceed gross profits |
| 1.0 - 1.5 | Marginal — profitable but with thin margins that may erode in live trading |
| 1.5 - 2.0 | Good — solid profitability with reasonable margin for real-world friction |
| 2.0 - 3.0 | Very good — strong edge with room for execution imperfections |
| Above 3.0 | Exceptional — verify that results are not overfitted or based on too few trades |
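
If you export the trade list, both win rate and profit factor can be reproduced by hand. The sketch below assumes a plain list of per-trade profit and loss values, net of costs; the function names and the sample numbers are made up for illustration and do not reflect Arconomy's export format.

```python
import math


def win_rate(trade_pnls):
    """Percentage of trades that closed in profit (breakeven trades count as non-winners)."""
    if not trade_pnls:
        raise ValueError("at least one trade is required")
    wins = sum(1 for pnl in trade_pnls if pnl > 0)
    return 100.0 * wins / len(trade_pnls)


def profit_factor(trade_pnls):
    """Gross profit divided by gross loss, with gross loss taken as a positive amount."""
    gross_profit = sum(pnl for pnl in trade_pnls if pnl > 0)
    gross_loss = -sum(pnl for pnl in trade_pnls if pnl < 0)
    if gross_loss == 0:
        # No losing trades: the ratio is undefined (or unbounded if there were wins).
        return math.inf if gross_profit > 0 else math.nan
    return gross_profit / gross_loss


# Hypothetical trade results: 3 winners, 5 losers.
sample = [120.0, -50.0, 80.0, -40.0, -30.0, 200.0, -60.0, -20.0]
print(f"win rate: {win_rate(sample):.1f}%")
print(f"profit factor: {profit_factor(sample):.2f}")
```

In this sample only 37.5% of trades win, yet gross profit (400) is twice gross loss (200), giving a profit factor of 2.0.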
Maximum Drawdown
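The largest peak-to-trough decline in the portfolio's equity during the backtest period. Maximum drawdown is the loss you would have experienced if you had started trading at the equity peak just before the deepest decline. It describes the worst loss in the tested history, not a limit on future losses: drawdowns in live trading can be deeper.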
Max Drawdown = (Peak Equity - Trough Equity) / Peak Equity x 100
Maximum drawdown is arguably the most important risk metric. A strategy that returns 50% annually but has a 60% maximum drawdown requires extraordinary discipline to trade: most traders would abandon it during the drawdown, and recovering from a 60% loss takes a 150% gain on the remaining equity. As a general guideline, a maximum drawdown below 20% is considered manageable for most retail traders.
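The drawdown formula is applied against a running peak rather than a single global peak. The following is a minimal sketch of that calculation, assuming the equity curve is available as a simple sequence of account values in time order; `max_drawdown_pct` and the sample curve are illustrative names and data, not Arconomy output.

```python
def max_drawdown_pct(equity):
    """Largest percentage decline from any running peak to a later trough."""
    if not equity:
        raise ValueError("equity curve must not be empty")
    peak = equity[0]
    worst = 0.0
    for value in equity:
        # Track the highest equity seen so far.
        peak = max(peak, value)
        if peak > 0:
            decline = (peak - value) / peak * 100.0
            worst = max(worst, decline)
    return worst


# Hypothetical equity curve: falls from 120 to 90, later from 130 to 104.
curve = [100.0, 120.0, 90.0, 130.0, 104.0]
print(f"max drawdown: {max_drawdown_pct(curve):.1f}%")
```

Here the first decline (120 to 90, or 25%) is deeper than the second (130 to 104, or 20%), so the maximum drawdown is 25% even though the second decline starts from a higher peak.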
Sharpe Ratio
A risk-adjusted return metric that measures how much excess return you earn per unit of risk (volatility). Higher values indicate better risk-adjusted performance.
Sharpe Ratio = (Strategy Return - Risk-Free Rate) / Standard Deviation of Returns
| Sharpe Ratio | Interpretation |
|---|---|
| Below 0.5 | Poor risk-adjusted returns |
| 0.5 - 1.0 | Acceptable — comparable to many traditional investment approaches |
| 1.0 - 2.0 | Good — strong risk-adjusted performance |
| 2.0 - 3.0 | Very good — institutional-quality risk-adjusted returns |
| Above 3.0 | Exceptional — rare in practice; verify for overfitting |
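
The Sharpe ratio depends on the period over which returns are measured, so it helps to see the calculation spelled out. The sketch below is one common way to compute it from a series of periodic returns. The optional square-root-of-time annualisation is a widely used convention, and it is an assumption here: this guide does not state which return period or annualisation Arconomy's report uses.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=None):
    """Mean excess return divided by the sample standard deviation of returns.

    returns and risk_free_rate must use the same period (for example, both monthly).
    If periods_per_year is given, the result is scaled by sqrt(periods_per_year).
    """
    if len(returns) < 2:
        raise ValueError("at least two return observations are required")
    excess = [r - risk_free_rate for r in returns]
    volatility = statistics.stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("returns have zero volatility; the ratio is undefined")
    ratio = statistics.mean(excess) / volatility
    if periods_per_year is not None:
        ratio *= math.sqrt(periods_per_year)
    return ratio


# Hypothetical monthly returns as fractions (0.01 = 1%).
monthly = [0.01, 0.03, -0.01, 0.02]
print(f"per-period Sharpe: {sharpe_ratio(monthly):.2f}")
print(f"annualised Sharpe: {sharpe_ratio(monthly, periods_per_year=12):.2f}")
```

When comparing a figure against the table above, check that both refer to the same return period. A monthly and an annualised Sharpe ratio for the same strategy differ by a large factor.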
Expectancy
The average amount you can expect to make (or lose) per trade. Expectancy combines win rate and average win/loss size into a single value that tells you whether the strategy has a positive edge.
Expectancy = (Win Rate x Average Win) - (Loss Rate x Average Loss)
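A positive expectancy means the strategy has an edge. In the formula, Win Rate and Loss Rate are fractions (with no breakeven trades, Loss Rate = 1 - Win Rate) and Average Loss is a positive amount, so expectancy is expressed in account currency per trade. The higher the expectancy, the more you can expect to earn per trade on average. Multiply expectancy by the expected number of trades per period to estimate the strategy's total expected return.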
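As a worked example of the expectancy formula, the sketch below uses made-up figures: a 40% win rate, an average win of 300, and an average loss of 100. The function name and the assumed 50 trades per month are illustrative only.

```python
def expectancy(win_rate, average_win, average_loss):
    """Expected profit per trade.

    win_rate is a fraction between 0 and 1; average_win and average_loss are
    positive amounts in account currency. Breakeven trades are ignored, so the
    loss rate is taken as 1 - win_rate.
    """
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be between 0 and 1")
    loss_rate = 1.0 - win_rate
    return win_rate * average_win - loss_rate * average_loss


per_trade = expectancy(win_rate=0.40, average_win=300.0, average_loss=100.0)
trades_per_month = 50  # assumed trading frequency
print(f"expectancy per trade: {per_trade:.2f}")
print(f"expected profit per month: {per_trade * trades_per_month:.2f}")
```

The strategy loses six trades out of ten, yet each trade is worth about 60 on average (0.4 x 300 minus 0.6 x 100), or roughly 3,000 per month at the assumed frequency.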
Visual Analysis Tools
Numbers alone do not tell the full story. Arconomy provides several visual tools to help you understand the shape and character of your strategy's performance.
Equity Curve
The equity curve plots your portfolio value over time throughout the backtest period. A healthy equity curve shows a generally upward trajectory with manageable pullbacks. The shape of the curve reveals important characteristics:
- Smooth upward slope — Consistent performance across market conditions. This is the ideal shape.
- Staircase pattern — Periods of flat performance followed by sharp gains. Common in trend-following strategies that wait for large moves.
- Volatile swings — Large gains followed by large losses. Indicates high risk even if the net result is positive.
- Hockey stick — Flat or losing for most of the period, then a sudden large gain. This is a red flag — the profit may depend on a single outlier trade.
Drawdown Chart
The drawdown chart shows the percentage decline from the equity peak at each point in time. It is displayed as a downward-facing area chart, making it easy to see the depth and duration of each drawdown period.
Pay attention to both the depth and duration of drawdowns. A 15% drawdown that recovers in two weeks is very different from a 15% drawdown that takes six months to recover. Longer recovery periods test your patience and confidence in the strategy.
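Drawdown duration can be measured as the number of consecutive periods the equity curve spends below its previous high. The sketch below shows one way to compute the longest such stretch, assuming one equity value per bar or day; the function name and data are hypothetical.

```python
def longest_underwater_period(equity):
    """Longest run of consecutive observations below the running equity peak."""
    peak = float("-inf")
    current = 0
    longest = 0
    for value in equity:
        if value >= peak:
            # New high (or a full recovery): the drawdown period ends.
            peak = value
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


# Hypothetical daily equity: a three-day drawdown, then a one-day drawdown.
curve = [100, 110, 105, 100, 108, 111, 109, 112]
print(f"longest time underwater: {longest_underwater_period(curve)} periods")
```

Running this alongside the maximum drawdown figure gives both dimensions described above: how deep the worst decline was and how long the strategy took to recover.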
Trade Distribution
The trade distribution chart shows the frequency of trades across different profit/loss ranges. A well-behaved strategy will typically show a distribution with:
- A cluster of small wins and small losses near zero (these are the typical trades)
- A longer tail on the positive side (occasional large wins)
- A shorter, controlled tail on the negative side (losses are capped by risk management)
If the distribution shows a heavy negative tail (occasional very large losses), your risk management rules may need tightening. If all profits come from a tiny number of outlier trades, the strategy may not have a reliable edge.
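Both warning signs can be checked numerically from the trade list. The following sketch computes the share of gross profit contributed by the largest winners, and the size of the largest loss relative to the average loss. The function names, sample data, and any cut-off you apply to the results are assumptions, not Arconomy features.

```python
def top_winner_share(trade_pnls, top_n=1):
    """Fraction of gross profit that comes from the top_n largest winning trades."""
    winners = sorted((pnl for pnl in trade_pnls if pnl > 0), reverse=True)
    gross_profit = sum(winners)
    if gross_profit == 0:
        return 0.0
    return sum(winners[:top_n]) / gross_profit


def largest_loss_vs_average(trade_pnls):
    """Largest single loss divided by the average loss (both as positive amounts)."""
    losses = [-pnl for pnl in trade_pnls if pnl < 0]
    if not losses:
        return 0.0
    return max(losses) / (sum(losses) / len(losses))


# Hypothetical trade results dominated by a single large winner.
pnls = [500.0, 20.0, 30.0, -40.0, 10.0, -25.0, 40.0]
print(f"share of gross profit from best trade: {top_winner_share(pnls):.0%}")
print(f"largest loss / average loss: {largest_loss_vs_average(pnls):.2f}")
```

In this sample a single trade supplies over 80% of gross profit, the outlier dependence described above, while the loss side is well controlled because the largest loss is close to the average loss.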
Good vs Bad Results
There is no universal threshold for what makes a backtest result "good" — it depends on the instrument, timeframe, and your personal risk tolerance. However, the following benchmarks provide useful reference points for evaluating your results:
| Metric | Concerning | Acceptable | Strong |
|---|---|---|---|
| Net Profit (annualised) | Below 0% | 5% - 20% | Above 20% |
| Profit Factor | Below 1.2 | 1.2 - 2.0 | Above 2.0 |
| Max Drawdown | Above 30% | 15% - 30% | Below 15% |
| Sharpe Ratio | Below 0.5 | 0.5 - 1.5 | Above 1.5 |
| Trade Count | Below 30 | 30 - 200 | Above 200 |
These benchmarks assume a strategy with realistic transaction costs. If your backtest does not include spread, slippage, and commissions, the actual live performance will be significantly worse than the backtest suggests.
Red Flags to Watch For
Certain patterns in backtest results should raise immediate concern. If you encounter any of the following, investigate further before considering deployment.
Curve Fitting
Curve fitting (also called overfitting) occurs when a strategy is optimised so heavily against historical data that it captures noise rather than genuine market patterns. Signs of curve fitting include:
- Exceptional performance on in-sample data but poor performance on out-of-sample data
- Many configurable parameters, each fine-tuned to exact values
- Performance degrades rapidly when parameters are changed even slightly
- The strategy works on one specific instrument and date range but fails on others
See Configurable Parameters for techniques to detect and avoid curve fitting.
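The third sign, rapid degradation under small parameter changes, can be tested with a simple sensitivity sweep. The sketch below nudges each parameter up and down by 10% and records the worst change in a performance score. Everything here is hypothetical: `run_backtest` stands in for whatever produces your metric, `toy_backtest` is an invented scoring function used only to make the example runnable, and integer parameters such as indicator periods would need rounding in practice.

```python
def parameter_sensitivity(run_backtest, params, step=0.10):
    """Worst change in score when each parameter is moved by +/- step (as a fraction).

    run_backtest is any callable that takes a parameter dict and returns a score
    such as profit factor. Returns {parameter name: worst score change}.
    """
    baseline = run_backtest(params)
    report = {}
    for name, value in params.items():
        scores = []
        for factor in (1.0 - step, 1.0 + step):
            nudged = dict(params)  # copy so the original settings stay untouched
            nudged[name] = value * factor
            scores.append(run_backtest(nudged))
        report[name] = min(scores) - baseline
    return report


def toy_backtest(p):
    """Invented stand-in for a real backtest: peaks at slow_period=50, threshold=0.5."""
    return 2.0 - 0.001 * (p["slow_period"] - 50.0) ** 2 - 400.0 * (p["threshold"] - 0.5) ** 2


report = parameter_sensitivity(toy_backtest, {"slow_period": 50.0, "threshold": 0.5})
for name, change in report.items():
    print(f"{name}: worst score change {change:+.3f}")
```

In the toy example the score barely moves when `slow_period` changes but collapses when `threshold` changes, which is the fragile, fine-tuned behaviour that suggests curve fitting.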
Results That Seem Too Good
If a strategy shows annualised returns above 100% with a drawdown below 5% and a Sharpe ratio above 5, something is almost certainly wrong. Investigate thoroughly before trusting exceptional results.
Unrealistically good results are more often a sign of an error in the backtest setup than a genuine edge. Common causes include:
- Look-ahead bias — The strategy unknowingly uses future information to make decisions. For example, using end-of-day data for intra-day entries.
- Missing costs — Spread, slippage, or commission settings are set to zero or unrealistically low values.
- Survivorship bias — Testing only on instruments that still exist today, excluding those that were delisted or went bankrupt.
- Data errors — Bad ticks, duplicate data, or incorrect timestamps that create artificial trading opportunities.
Low Trade Count
A strategy that produces fewer than 30 trades in a backtest does not have enough data points for statistical significance. Even if the net profit is positive, the results could easily be the product of chance rather than a genuine edge.
As a general rule, aim for at least 100 trades in your backtest to draw meaningful conclusions. The more trades your strategy generates, the more confident you can be that the results reflect a real pattern in the market.
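To see why trade count matters, consider how uncertain a measured win rate is at different sample sizes. The sketch below uses the normal approximation for a proportion to estimate a 95% margin of error. It assumes trades are independent, which real trades often are not, so treat the numbers as a rough guide; the function name is illustrative.

```python
import math


def win_rate_margin(win_rate, trade_count, z=1.96):
    """Approximate margin of error for a win rate (fraction), at z standard errors.

    Uses the normal approximation sqrt(p * (1 - p) / n); z=1.96 corresponds to
    roughly 95% confidence.
    """
    if trade_count <= 0:
        raise ValueError("trade_count must be positive")
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be between 0 and 1")
    return z * math.sqrt(win_rate * (1.0 - win_rate) / trade_count)


# The same observed 60% win rate at different sample sizes.
for count in (30, 100, 500):
    margin = win_rate_margin(0.60, count)
    print(f"{count} trades: 60% +/- {margin:.1%}")
```

With 30 trades, an observed 60% win rate is consistent with a true win rate anywhere from the low 40s to the high 70s. At 100 trades the range narrows to about ten percentage points either side.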
Inconsistent Performance Across Time
If your strategy earned 90% of its profit in a single month and was flat or losing for the rest of the backtest period, the result is not reliable. Use iterative backtesting to verify that performance is distributed reasonably across different time periods.
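A quick way to check for this kind of concentration is to group trades by calendar month and compare the best month with the total. The sketch below assumes each trade is a `(close date, profit or loss)` pair; the function name and the sample trades are hypothetical.

```python
from collections import defaultdict
from datetime import date


def largest_month_share(trades):
    """Best month's net profit as a fraction of total net profit.

    trades is an iterable of (close_date, pnl) pairs. Returns None when total
    net profit is not positive, because the share is not meaningful then.
    The result can exceed 1.0 if the other months are net losers.
    """
    monthly = defaultdict(float)
    for closed_on, pnl in trades:
        monthly[(closed_on.year, closed_on.month)] += pnl
    total = sum(monthly.values())
    if total <= 0:
        return None
    return max(monthly.values()) / total


# Hypothetical trades: January dominates the result.
trades = [
    (date(2023, 1, 5), 400.0),
    (date(2023, 1, 20), 500.0),
    (date(2023, 2, 3), -50.0),
    (date(2023, 2, 17), 100.0),
    (date(2023, 3, 9), 50.0),
]
share = largest_month_share(trades)
print(f"best month's share of net profit: {share:.0%}")
```

In this sample January accounts for 90% of net profit, matching the pattern described above.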
What to Do After Reviewing Results
Once you have reviewed your backtest results, the typical next steps are:
- If results are poor — Revisit your strategy design. Examine the event-level replay to understand why individual trades lost money. Adjust your rules and backtest again.
- If results are promising but unvalidated — Run an out-of-sample test or walk-forward analysis using configurable parameters. Ensure the edge persists on data the strategy has never seen.
- If results are validated and robust — Proceed to paper trading on a demo account to verify that live execution matches backtest expectations. Only move to live trading after a successful paper trading period.
The best traders treat backtesting as a continuous process, not a one-time event. Markets evolve, and a strategy that works today may need adjustments in the future. Schedule regular re-evaluations of your deployed strategies using fresh data.