The Essence of Backtesting

Andrew Manuhutu
May 19

In systematic trading, backtesting is a foundational step before deploying any strategy in live market conditions. It allows you to evaluate how your algorithm would have performed using historical price data and helps you identify strengths, weaknesses, and potential failure points.

In this article, I’ll break down what backtesting actually is, why it matters, and how to interpret it properly and effectively. Since my workflow is built around the MetaTrader platform, the examples and explanations will focus on the platform’s integrated Strategy Tester.

What is backtesting?

Backtesting is the process of evaluating a trading strategy by running it on historical market data, which in MetaTrader is typically provided by the broker. In MetaTrader, an automated trading program is called an Expert Advisor (EA), and the platform provides an integrated Strategy Tester to simulate execution, measure performance metrics, and analyze system behavior.

A common limitation, however, is the data depth and quality offered by different brokers. While some brokers provide extensive real-tick history, others supply only limited datasets or low-quality bars. When this happens, traders must either import external history (for example, via the History Center in MetaTrader 4 or custom symbols in MetaTrader 5) or connect to a broker that offers a more comprehensive market history.

Although backtesting cannot guarantee future performance, it provides a structured way to evaluate whether a strategy has historically behaved as expected. As with any statistical analysis, historical results are descriptive, not predictive. They show how the system performed under past market conditions, but they always come with assumptions and limitations—often disclosed, but not always fully understood.

The same principle applies to automated trading systems. Strategy Tester reports can be misleading if interpreted without understanding the underlying tick modeling method (such as "Every tick based on real ticks"), the spread simulation (fixed vs. variable), or the statistical methods used. Once you understand how the backtest was generated—what data was used, how the model executed trades, and what metrics were applied—you gain a clearer picture of how the EA actually operates and how reliable its performance metrics may be.

Trade Count vs. Backtest Period

In quantitative strategy evaluation, the sample size (trade count) is a key metric used to assess the statistical robustness of an EA. While some practitioners consider a minimum sample size of 30 trades acceptable, others require 100 or more to ensure the reliability of metrics such as expectancy, drawdown distribution, and Sharpe ratio stability.
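
To get a feel for why trade count matters, the sketch below estimates how uncertain an observed win rate is at different sample sizes. It assumes independent trades and uses a normal approximation to the binomial distribution; the function name `win_rate_margin` and the 50% example win rate are illustrative assumptions, not values taken from a Strategy Tester report.

```python
import math


def win_rate_margin(win_rate: float, n_trades: int, z: float = 1.96) -> float:
    """Half-width of an approximate 95% confidence interval for a win rate.

    Uses the normal approximation to the binomial distribution and
    assumes trades are independent (a simplification).
    """
    return z * math.sqrt(win_rate * (1.0 - win_rate) / n_trades)


# Hypothetical example: the same observed 50% win rate at three sample sizes.
for n in (30, 100, 400):
    margin = win_rate_margin(0.5, n)
    print(f"{n:>4} trades: margin of about +/-{margin:.1%}")
```

Under these assumptions, 30 trades leave a margin of roughly 18 percentage points around the observed win rate, and 100 trades still leave close to 10, which is why small samples deserve caution.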

However, the required number of trades depends heavily on the strategy’s trade frequency:

High‑frequency or intraday systems naturally generate large datasets, making trade‑count‑based validation meaningful.
Swing‑trading or pattern‑recognition systems may only produce a limited number of signals, making strict trade‑count thresholds impractical.

For these lower-frequency systems, the backtest period becomes significantly more important than the raw number of trades. For example, a backtest spanning 2020–2025 exposes the EA to multiple distinct market regimes and macro environments:

2020: Pandemic shock (lockdowns, liquidity injections, extreme volatility)
2021: Post‑pandemic reopening (demand recovery, supply‑chain distortions)
2022: Inflation shock and geopolitical disruption (energy crisis, commodity volatility)
2023: Global monetary tightening (rapid rate‑hike cycle, risk‑asset repricing)
2024: Disinflation phase (slowing growth, early easing signals)
2025: Mixed expansion (policy divergence, persistent inflation pockets)

Testing across these heterogeneous environments gives you better evidence of whether the EA is resilient to regime shifts, volatility clustering, and structural breaks.

Therefore, for robust system validation, I prioritize a multi-regime time period over an absolute trade count. Backtesting across 2020–2025 provides a critical stress test to determine whether the EA can maintain its performance across fundamentally different economic conditions.

In my specific example (see Figure 1), the EA generated 92 trading signals during the 2020–2025 backtest period but only executed 46 trades due to strict internal filters.

Figure 1: Strategy Tester report with the backtesting results. (**Capital**: EUR 10,000; **Testing period**: 2020-2025; **Risk/Trade**: 1%; **Risk-Reward Ratio**: 1:2.5; **Modelling**: "every tick based on real ticks"; **Delays**: "zero latency")

Figure 2: Balance graph of the backtesting results.

Interpreting Your Backtest Metrics — What Actually Matters?

Backtesting throws a massive amount of data at you, and the real skill lies in knowing which numbers matter and how to read them. As a rule of thumb, metrics expressed as factors and ratios are far more meaningful than raw monetary values, because currency-based results depend entirely on position sizing and account balance.

Below are the core metrics you should focus on in your Strategy Tester report (see Figure 1):

History Quality (Target: 90%-100%)

This metric tells you how reliable your historical data is. In MetaTrader 5, a history quality in the 90-100% range indicates that the backtest ran on largely complete, broker-supplied price data. Lower values point to missing bars or gaps in the data, meaning your results are less likely to reflect real market behavior.

Profit Factor (Target: >1.5)

The Profit Factor measures gross profit relative to gross loss. A value of 1.0 represents the absolute breakeven point. The higher the factor, the stronger the system's statistical edge.
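
As a minimal sketch of the calculation described above, the snippet below derives a profit factor from a list of closed-trade results. The function name `profit_factor` and the sample trades are hypothetical and only serve to illustrate the ratio.

```python
def profit_factor(trade_results):
    """Gross profit divided by gross loss for a list of closed-trade results."""
    gross_profit = sum(p for p in trade_results if p > 0)
    gross_loss = -sum(p for p in trade_results if p < 0)
    if gross_loss == 0:
        # No losing trades: the ratio is undefined, reported here as infinity.
        return float("inf") if gross_profit > 0 else 0.0
    return gross_profit / gross_loss


# Hypothetical trade results in account currency.
sample_trades = [300.0, -100.0, -100.0, 250.0, -100.0]
print(f"Profit factor: {profit_factor(sample_trades):.2f}")
```

Here the gross profit is 550 and the gross loss is 300, so the profit factor is about 1.83.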

Recovery Factor (Target: >1.5)

This metric shows how efficiently your EA recovers from drawdowns by dividing net profit by the maximal drawdown. A value above 1.5 suggests your system can swiftly bounce back from equity drops. The higher the number, the more resilient the strategy.
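
The division described above can be sketched as follows. This version works on a simple balance curve of closed trades, which is an assumption made for brevity; the Strategy Tester bases its figure on its own drawdown statistics, so treat the result as an approximation. The names `max_drawdown` and `recovery_factor` and the sample curve are hypothetical.

```python
def max_drawdown(curve):
    """Largest peak-to-trough drop (in currency) along a balance curve."""
    peak = curve[0]
    worst = 0.0
    for value in curve:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


def recovery_factor(curve):
    """Net profit divided by the maximal drawdown of the curve."""
    net_profit = curve[-1] - curve[0]
    drawdown = max_drawdown(curve)
    if drawdown == 0:
        return float("inf") if net_profit > 0 else 0.0
    return net_profit / drawdown


# Hypothetical balance curve after each closed trade.
balance = [10_000, 10_300, 10_200, 10_100, 10_400, 10_300, 10_600]
print(f"Max drawdown: {max_drawdown(balance)}")
print(f"Recovery factor: {recovery_factor(balance):.2f}")
```

In this example the net profit is 600 and the deepest drop is 200, giving a recovery factor of 3.0.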

Sharpe Ratio (Target: 1.5 - 3.0)

The Sharpe Ratio is a risk‑adjusted performance score, created by William Sharpe. It quantifies how much excess return you earn per unit of volatility or risk taken. However, values significantly above 3.0 should be scrutinized, as they often flag overfitting (curve fitting) or excessive leverage rather than a holy grail strategy.
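
For intuition, here is a textbook-style Sharpe ratio computed from a series of periodic returns. It is a sketch under stated assumptions: monthly returns, a constant per-period risk-free rate, and annualization by the square root of the number of periods per year. MetaTrader's own calculation may differ in its inputs and scaling, so the numbers are not directly comparable to the report.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_per_period=0.0, periods_per_year=12):
    """Annualized Sharpe ratio from periodic returns (textbook definition)."""
    excess = [r - risk_free_per_period for r in returns]
    stdev = statistics.stdev(excess)  # sample standard deviation
    if stdev == 0:
        raise ValueError("returns have zero variance; Sharpe ratio is undefined")
    return statistics.mean(excess) / stdev * math.sqrt(periods_per_year)


# Hypothetical monthly returns expressed as fractions of the account balance.
monthly_returns = [0.02, -0.01, 0.03, 0.01, -0.02, 0.04]
print(f"Sharpe ratio: {sharpe_ratio(monthly_returns):.2f}")
```

A very smooth return series drives the standard deviation toward zero and the ratio upward, which is one reason unusually high values deserve a second look.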

Maximal consecutive loss

This metric reveals the worst losing streak your EA experienced in terms of consecutive trades. Understanding this number is essential for managing your psychological expectations and avoiding premature optimization or strategy abandonment during live drawdowns.
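
If you export your trade list, the longest losing streak can be recomputed in a few lines. The sketch below counts consecutive losing trades only; the function name `max_consecutive_losses` and the sample data are hypothetical, and break-even trades are treated as ending a streak, which is an assumption.

```python
def max_consecutive_losses(trade_results):
    """Length of the longest run of consecutive losing trades."""
    longest = 0
    current = 0
    for pnl in trade_results:
        if pnl < 0:
            current += 1
            longest = max(longest, current)
        else:
            # A winning or break-even trade ends the streak.
            current = 0
    return longest


# Hypothetical trade results in account currency.
results = [100.0, -50.0, -50.0, -50.0, 200.0, -50.0, -50.0]
print(max_consecutive_losses(results))
```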

Metrics That Require Context

Some numbers must be interpreted in context and cannot be taken at face value. They depend heavily on your strategy’s structure.

Risk-to-Reward Ratio (RRR)

Although the MetaTrader Strategy Tester report does not explicitly show your average Risk-Reward Ratio (RRR), you can easily calculate it by dividing your average profit by your average loss. The target RRR describes how much you aim to gain compared to how much you are willing to risk; the figure calculated from the report shows what the strategy actually achieved.

For a strategy like the one discussed here, this metric should remain above 1.0, meaning your average winning trade is larger than your average losing trade. For example, a target RRR of 1:3 means you risk 1 unit of capital to gain 3 units. If your average loss exceeds your average win, the strategy can only remain profitable with a correspondingly high win rate, which leaves little margin for error.

As seen in Figure 1, with an average win of €309 and an average loss of €121, the strategy achieves an actual RRR of roughly 1:2.5—which remains a solid and profitable number.

Profit Trades (% of Total)

This represents the win rate of your EA. However, a win rate alone is meaningless without factoring in your RRR. You can be highly profitable with a 30% win rate if your RRR is 1:3. Conversely, you can lose money with a 70% win rate if your RRR is 3:1 (where a single loss wipes out multiple wins).

During backtesting, my EA recorded a 39% win rate. With a realized RRR of roughly 1:2.5, the break-even win rate is about 28.6%, so the strategy clears the baseline math with room to spare, suggesting it has the potential to maintain a sustainable, long-term edge (see Figures 1 and 2).
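
The interaction between win rate and RRR can be expressed as an expectancy per trade, measured in units of risk (often called R). The sketch below assumes every loss costs exactly 1R and every win pays exactly the reward multiple, and it ignores costs such as spread and commission; the function name `expectancy_in_r` is hypothetical.

```python
def expectancy_in_r(win_rate: float, reward_per_unit_risk: float) -> float:
    """Average result per trade in units of risk (R).

    Assumes each win pays `reward_per_unit_risk` R and each loss costs 1 R.
    """
    return win_rate * reward_per_unit_risk - (1.0 - win_rate)


# Scenarios mentioned in the text: (win rate, reward per unit of risk).
scenarios = {
    "30% win rate, RRR 1:3": (0.30, 3.0),
    "70% win rate, RRR 3:1": (0.70, 1.0 / 3.0),
    "39% win rate, RRR 1:2.5": (0.39, 2.5),
}
for label, (p, reward) in scenarios.items():
    print(f"{label}: {expectancy_in_r(p, reward):+.3f} R per trade")
```

Under these simplifying assumptions, the first and third scenarios come out positive, while the 70% win rate with a 3:1 risk-to-reward profile is negative.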

To see how the math works in your favor as your RRR increases, here is the exact minimum win rate required just to break even:

RRR 1:1 → 50.00% win rate
RRR 1:2 → 33.33% win rate
RRR 1:3 → 25.00% win rate
RRR 1:4 → 20.00% win rate
RRR 1:5 → 16.67% win rate
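
The values in this list follow from setting the expectancy to zero: with a win rate p and a reward of R units per unit risked, p × R − (1 − p) = 0 gives p = 1 / (1 + R). A short sketch that reproduces the list, ignoring trading costs (the function name `breakeven_win_rate` is hypothetical):

```python
def breakeven_win_rate(reward_per_unit_risk: float) -> float:
    """Minimum win rate needed to break even for an RRR of 1:reward."""
    return 1.0 / (1.0 + reward_per_unit_risk)


for reward in (1, 2, 3, 4, 5):
    print(f"RRR 1:{reward} -> {breakeven_win_rate(reward):.2%} win rate")
```

The same function gives about 28.57% for an RRR of 1:2.5.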

Expected Payoff Ratio (Target: >0.3)

While MetaTrader displays only the raw Expected Payoff (the average profit/loss per trade in currency), expressing it as a ratio relative to your risk per trade provides deeper insights. Using the metrics from my backtest results (see Figure 1) as an example:

Average loss: €121 (or about 1% Risk/Trade)
Expected payoff: €45
Expected payoff ratio: €45 / €121 ≈ 0.37

This means that for every trade executed, the system mathematically expects to return 37% of the capital risked. You should aim for an Expected Payoff Ratio above 0.30 (or 30%). Anything below this threshold leaves too little room for error, as real-world factors like slippage, broker commissions, and variable spreads can easily eat up the edge.
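
The ratio is a single division, sketched below with the figures quoted above. The function name `expected_payoff_ratio` is hypothetical, and using the average loss as a proxy for the amount risked per trade is the same simplification the text makes.

```python
def expected_payoff_ratio(expected_payoff: float, average_loss: float) -> float:
    """Expected payoff per trade relative to the average amount lost per losing trade."""
    return expected_payoff / abs(average_loss)


ratio = expected_payoff_ratio(expected_payoff=45.0, average_loss=121.0)
print(f"Expected payoff ratio: {ratio:.2f}")  # prints 0.37
print("Above 0.30 target" if ratio > 0.30 else "Below 0.30 target")
```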

Which numbers are commonly misinterpreted?

Traders frequently misinterpret three of the most prominent metrics in a Strategy Tester report:

Total Net Profit
Equity Drawdown Maximal / Relative
Expected Payoff

These metrics are often misunderstood because MetaTrader displays them in absolute monetary units by default. Monetary values are highly deceptive; they are completely dependent on your position sizing model, initial account balance, and risk-per-trade parameters.

Consider this example using a fixed fractional position sizing approach:

If you risk 1% on a €10,000 account with a 1:3 RRR, a single winning trade yields €300 (3%). If you increase your risk to 2%, that identical trade yields €600 (6%).

The absolute monetary profit doubles, but the underlying statistical quality and edge of the strategy remain completely unchanged.
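
The same point in code: the sketch below scales the risk per trade and shows that the monetary profit changes while the result measured in units of risk stays the same. The function name `winning_trade_profit` is hypothetical, and the example assumes the trade hits its full 1:3 target.

```python
def winning_trade_profit(balance: float, risk_fraction: float,
                         reward_per_unit_risk: float) -> float:
    """Profit of one winning trade under fixed fractional position sizing."""
    amount_risked = balance * risk_fraction
    return amount_risked * reward_per_unit_risk


balance = 10_000.0
for risk_fraction in (0.01, 0.02):
    profit = winning_trade_profit(balance, risk_fraction, 3.0)
    r_multiple = profit / (balance * risk_fraction)
    print(f"Risk {risk_fraction:.0%}: profit EUR {profit:.0f} "
          f"({profit / balance:.0%} of balance), result = {r_multiple:.1f} R")
```

Both rows report the same 3.0 R result; only the currency amount and the percentage of balance change with the risk setting.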

This is why chasing raw profit figures misses the entire purpose of backtesting. The goal of a simulation is to evaluate statistical robustness and algorithmic logic, not to admire a hypothetical cash payout. Artificially increasing your leverage or risk will naturally inflate your net profit, but it will simultaneously scale your equity drawdown without altering the strategy's win rate or expectancy.

The same rule applies to the Expected Payoff. On its own, it only reflects the average currency amount gained or lost per trade. As established earlier, a far more meaningful metric is the Expected Payoff Ratio relative to your risk per trade. This ratio isolates the strategy's performance from account size by revealing your average percentage return per unit of risk.

To evaluate an EA properly, you must look past the currency signs. Convert these monetary values into percentages and focus strictly on the statistical behavior of the system.

So, what now?

If you have completed your backtesting and feel confident about the numbers, the next steps are all about grounding those expectations in reality and validating your system under live market conditions.

Manage your expectations

Backtesting results are useful, but they are also theoretical. They typically assume idealized execution, little or no slippage, and clean data. In my example (Figure 1), my EA is hardcoded to a strict 1:3 RRR; however, the realized results reflect only about 1:2.5.

Because live trading introduces unpredictable variables, it’s smart to scale down your expectations. For example:

A backtested profit factor of 2.0 might translate to 1.4–1.5 in real trading.
A Sharpe Ratio of 2.8 might realistically land around 1.8–2.0.

These are still solid numbers—they’re just more aligned with real‑world market friction. This mindset helps you avoid over‑optimism and keeps your risk management grounded.

Run a demo test (Incubation Phase)

Before going live, put your EA through 3–6 months of demo or paper trading. This is where you build trust in your system and observe how it behaves with real‑time price feeds.

Conduct regular weekly or monthly performance reviews to evaluate where your EA excels, where it struggles, and whether its real-time behavior aligns with your original backtest assumptions. This step is crucial for verifying the algorithm’s operational stability.

Transition to Live Forward Testing

Once the demo results validate the system's logic, transition to live trading. Start with the lowest possible risk parameters (e.g., minimum lot sizes) according to your risk appetite. Continue your regular performance reviews to ensure the EA behaves exactly as engineered.

Forward testing is where you confirm the quality of execution, the slippage impact, the spread sensitivity, and the real-world drawdown behavior. If something deviates from expectations, adjust carefully and based on data, not emotion.

Maintain a Defensive and Critical Mindset

Never let your guard down. A successful backtest is a proof of concept, not a guarantee of future performance. Markets are dynamic and subject to structural breaks. Continuously monitor your EA, question outlier returns, and avoid overconfidence.

If you'd prefer not to navigate this transition alone, I’d be happy to help. I know that inviting a consultant into your private trading process is a big step—but if you're ready for an objective audit of your system’s stability, you’re more than welcome to Book a Strategy Intro.

HAPPY TRADING!

Sygmative