Definition
Walk-forward analysis is a validation protocol in which a model is fit on a window of historical data, evaluated on the subsequent out-of-sample window, then the window is rolled forward and the procedure repeated. The collected out-of-sample results are concatenated to form a single performance series that the analyst evaluates as if it were live trading. It is the closest backtesting approximation to actual deployment because, at every point in the evaluation series, the model used only information that would have been available at that time.
Why it matters
Compared to a single fixed train/test split, walk-forward analysis tests the model across many market regimes, surfaces parameter drift from one refit to the next, and tends to narrow the gap between research performance and live results.
How it works
The procedure is parameterised by four choices. Window length sets how much history each fit sees and trades off responsiveness to recent regimes against statistical stability. Step size sets how often the model is refit (daily, weekly, or monthly, for example) and trades off freshness against compute. Whether the training window is rolling (fixed length, oldest data drops off) or expanding (anchored start, grows over time) shapes how heavily older data is weighted. Finally, the lookback horizon for evaluation determines how much out-of-sample history must accumulate before any final metric is reported.
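The window mechanics can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `walk_forward_splits` and its parameters are hypothetical, and observations are assumed to be equally spaced and indexed 0..n-1 in time order.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_samples: int,
    train_size: int,
    test_size: int,
    step: int,
    expanding: bool = False,
) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) pairs in chronological order."""
    if min(train_size, test_size, step) <= 0:
        raise ValueError("train_size, test_size and step must be positive")
    train_end = train_size  # exclusive end of the first training window
    while train_end + test_size <= n_samples:
        # Expanding: start stays anchored at index 0 and the window grows.
        # Rolling: fixed length, so the oldest observations drop off.
        train_start = 0 if expanding else train_end - train_size
        # The test window begins exactly where the training window ends.
        yield range(train_start, train_end), range(train_end, train_end + test_size)
        train_end += step


rolling = list(walk_forward_splits(10, train_size=4, test_size=2, step=2))
expanding = list(walk_forward_splits(10, train_size=4, test_size=2, step=2, expanding=True))

for train, test in rolling:
    print("train", list(train), "test", list(test))
```

Setting `step` equal to `test_size`, as here, gives out-of-sample windows that tile the evaluation period without overlap, so they can be concatenated directly. A `step` smaller than `test_size` would produce overlapping test windows, and the overlap would need to be handled before concatenation.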
Two rules distinguish a sound walk-forward run from a misleading one. First, hyperparameter selection must happen inside each fit, using only that window's training data, not across the full sample. Selecting hyperparameters on all the data and only then walking forward leaks future information into every window. Second, the out-of-sample results must be reported untouched; the analyst does not get to inspect them, adjust the model, and re-run, because that would silently turn out-of-sample data into in-sample data. When both rules are obeyed, the resulting performance series is the most honest available read on whether the strategy is real or a backtest mirage.
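The first rule can be illustrated with a toy example. The sketch below assumes a deliberately simple stand-in model, a k-period moving-average forecast whose only hyperparameter is k. The helper names (`one_step_errors`, `select_k`, `walk_forward`), the validation scheme, and the synthetic series are all hypothetical choices made for illustration. The point is only that k is chosen again in every window from data that ends before the test window begins.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def one_step_errors(series: Sequence[float], k: int, start: int, stop: int) -> List[float]:
    """Absolute errors of a k-period moving-average forecast for t in [start, stop)."""
    # The forecast for time t uses only series[t-k:t], i.e. values known before t.
    return [abs(series[t] - mean(series[t - k:t])) for t in range(start, stop)]


def select_k(series: Sequence[float], train_start: int, train_end: int,
             candidates: Sequence[int], val_size: int) -> int:
    """Pick k using only observations inside [train_start, train_end)."""
    val_start = train_end - val_size  # validation block is the tail of the training window
    usable = [k for k in candidates if val_start - k >= train_start]
    if not usable:
        raise ValueError("training window too short for every candidate k")
    return min(usable, key=lambda k: mean(one_step_errors(series, k, val_start, train_end)))


def walk_forward(series: Sequence[float], train_size: int, test_size: int,
                 candidates: Sequence[int], val_size: int) -> Tuple[List[float], List[int]]:
    """Rolling walk-forward run: reselect k per window, collect out-of-sample errors."""
    oos_errors: List[float] = []
    chosen: List[int] = []
    train_end = train_size
    while train_end + test_size <= len(series):
        train_start = train_end - train_size
        k = select_k(series, train_start, train_end, candidates, val_size)
        chosen.append(k)
        # k stays frozen for the whole test window; results are recorded as-is.
        oos_errors.extend(one_step_errors(series, k, train_end, train_end + test_size))
        train_end += test_size  # step equals test_size, so test windows do not overlap
    return oos_errors, chosen


# Synthetic data, for illustration only.
series = [float((i * 7) % 11) for i in range(30)]
oos_errors, chosen = walk_forward(series, train_size=12, test_size=6,
                                  candidates=[1, 2, 4], val_size=4)
print("k chosen per window:", chosen)
print("mean out-of-sample error:", mean(oos_errors))
```

A quick leakage check under this setup is to perturb the observations that come after a window's training end and confirm that the hyperparameter chosen for that window does not change. The second rule cannot be enforced in code; it is a matter of discipline about how often the out-of-sample series is inspected.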