Walk-Forward vs. In-Sample

If you only have in-sample, you have nothing

The single most common mistake in retail backtesting is reporting in-sample performance, that is, performance on the same data the strategy was tuned on. Because the parameters were chosen to fit that data, the figure is optimistically biased and tells you nothing reliable about future performance. Walk-forward analysis fits parameters on a rolling window, evaluates them on the immediately following out-of-sample window, then rolls both windows forward and repeats. The aggregate of the out-of-sample results is what you report.

walk_forward.py
python
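def walk_forward(df, fit_window=252, test_window=63):
    # fit_strategy, evaluate, and aggregate are strategy-specific helpers defined elsewhere.
    results = []
    # The + 1 keeps the final fold when the data ends exactly on a window boundary.
    for start in range(0, len(df) - fit_window - test_window + 1, test_window):
        fit = df.iloc[start : start + fit_window]
        test = df.iloc[start + fit_window : start + fit_window + test_window]
        params = fit_strategy(fit)  # tune on the fit window only
        oos = evaluate(test, params)  # score on data the fit never saw
        results.append(oos)
    return aggregate(results)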
A 1-year fit / 1-quarter test rolling window is a common starting point for daily-bar strategies.
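To make the gap between in-sample and out-of-sample concrete, here is a minimal, dependency-free sketch that uses the same fold layout as walk_forward. Everything in it is an assumption made for illustration: the toy momentum rule, the candidate lookbacks, the use of plain lists of returns instead of a DataFrame, and the synthetic pure-noise series. The helpers fit_strategy and evaluate are hypothetical stand-ins, not the ones the snippet above expects.

```python
import random
from statistics import mean


def strategy_returns(returns, lookback):
    """Toy momentum rule: hold the sign of the trailing `lookback`-bar sum.

    The position for bar t uses only returns[t - lookback : t], so the
    signal never sees the bar it trades.
    """
    out = []
    for t in range(lookback, len(returns)):
        trailing = sum(returns[t - lookback : t])
        position = 1 if trailing > 0 else -1
        out.append(position * returns[t])
    return out


def fit_strategy(fit, candidates=(5, 10, 20, 40)):
    """Hypothetical fit step: pick the lookback with the best mean return."""
    return max(candidates, key=lambda lb: mean(strategy_returns(fit, lb)))


def evaluate(window, lookback):
    """Mean per-bar return of the rule, using only bars inside `window`."""
    return mean(strategy_returns(window, lookback))


def walk_forward_folds(returns, fit_window=252, test_window=63):
    """Return one (in_sample, out_of_sample) pair of mean returns per fold."""
    folds = []
    # Inclusive bound: a final fold that ends exactly on the last bar is kept.
    last_start = len(returns) - fit_window - test_window
    for start in range(0, last_start + 1, test_window):
        fit = returns[start : start + fit_window]
        test = returns[start + fit_window : start + fit_window + test_window]
        lookback = fit_strategy(fit)
        folds.append((evaluate(fit, lookback), evaluate(test, lookback)))
    return folds


# Synthetic pure-noise returns: by construction there is no edge to find.
rng = random.Random(0)
returns = [rng.gauss(0.0, 0.01) for _ in range(252 * 4)]

folds = walk_forward_folds(returns)
in_sample = mean(f[0] for f in folds)
out_of_sample = mean(f[1] for f in folds)
print(f"folds={len(folds)} in-sample={in_sample:.5f} out-of-sample={out_of_sample:.5f}")
```

Because the fit step keeps the best of several candidates, the in-sample figure tends to look favorable even on noise, while the out-of-sample figure has no such selection built in. Only the second number belongs in a report.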
Heads up

Lookahead bias is the silent killer

Even in walk-forward, lookahead bias creeps in through the data pipeline: for example, a feature computed from close-of-day data but assumed available at the open, or a Z-score whose mean and standard deviation are taken over the full sample, future bars included. Audit every feature for the timestamp of its inputs.
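One way to run that audit is a truncation check: recompute a feature with all later bars removed and see whether its value at bar t changes. The sketch below is an illustration under assumptions, not a complete audit. The function names, the 20-bar window, and the synthetic trending series are all hypothetical, and it uses plain lists to stay dependency-free.

```python
from statistics import mean, pstdev


def zscore_full_sample(x):
    """Leaky: every value is scaled by statistics of the whole series."""
    mu, sigma = mean(x), pstdev(x)
    return [(v - mu) / sigma for v in x]


def zscore_trailing(x, window=20):
    """Point-in-time: bar t is scaled using x[t - window : t] only."""
    out = [None] * len(x)  # None marks the warm-up bars
    for t in range(window, len(x)):
        hist = x[t - window : t]
        sigma = pstdev(hist)
        out[t] = (x[t] - mean(hist)) / sigma if sigma > 0 else 0.0
    return out


def lag(values, bars=1):
    """Shift a feature forward so bar t only sees the value from bar t - bars.

    Use this when a close-based feature drives a decision at the next open.
    """
    return [None] * bars + values[:-bars]


def has_lookahead(feature, x, t):
    """Truncation check: does feature[t] change when later bars are removed?"""
    return feature(x)[t] != feature(x[: t + 1])[t]


# Synthetic trending series, chosen so full-sample statistics drift over time.
prices = [100.0 + 0.5 * i for i in range(60)]

print(has_lookahead(zscore_full_sample, prices, 30))  # True
print(has_lookahead(zscore_trailing, prices, 30))     # False
print(lag([1, 2, 3]))                                 # [None, 1, 2]
```

The check only catches dependence on later rows of the same series. It cannot tell you whether a given row was actually published by the time you trade on it, so the timestamp audit still has to be done by hand.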