What walk-forward actually means
And why most published backtests cheat without knowing it.
B3Quant Research
Walk-forward is the most-cited and least-understood phrase in quant marketing. It is also the single most important methodological detail that determines whether a strategy's track record means anything. We use it on every model we run. Most published backtests do not.
The naive backtest fits a model on the entire historical dataset, then evaluates that same model on the same dataset. In this setup the strategy's Sharpe is bounded only by the model's flexibility: given enough parameters, you can fit any pattern, real or noise, and produce a near-perfect P&L curve. The resulting number is meaningless.
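To make the point concrete, here is a minimal sketch on synthetic data. The "model" is a deliberately extreme, hypothetical one (one parameter per day, memorising the sign of that day's return); it is an illustration of unbounded flexibility, not any model we run. The return series is pure noise by construction.

```python
import random
import statistics

random.seed(0)

def sharpe(pnl, periods_per_year=252):
    """Annualised Sharpe ratio of a list of per-period returns."""
    return statistics.mean(pnl) / statistics.stdev(pnl) * periods_per_year ** 0.5

# Synthetic daily returns: pure noise, nothing real to learn.
returns = [random.gauss(0.0, 0.01) for _ in range(500)]

# Hypothetical maximally flexible "model": one parameter per day,
# fitted on the full dataset (it memorises each day's sign).
memorised = {t: (1 if r > 0 else -1) for t, r in enumerate(returns)}

# Naive backtest: evaluate on the same data the model was fitted on.
in_sample_pnl = [memorised[t] * r for t, r in enumerate(returns)]
in_sample_sharpe = sharpe(in_sample_pnl)

# Same positions applied to fresh noise the model never saw.
fresh = [random.gauss(0.0, 0.01) for _ in range(500)]
fresh_pnl = [memorised[t] * r for t, r in enumerate(fresh)]
fresh_sharpe = sharpe(fresh_pnl)

print(f"in-sample Sharpe: {in_sample_sharpe:.1f}")
print(f"fresh-data Sharpe: {fresh_sharpe:.1f}")
```

Every in-sample day is a winning day because the model has already seen the answer. On fresh data the same positions are a coin flip.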
A train/test split (say 70/30) is a partial fix. The model is fit only on the first 70% of data, then evaluated on the last 30%. The out-of-sample period now genuinely simulates performance on unseen data. But there's still a hidden problem: the 30% test period is fixed. If you tweak the model design after seeing the test result, you have effectively used the test set as a training set. Most quant teams do this without realising it.
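The hidden problem can be sketched in a few lines. Assume 200 hypothetical design variants, none of which has any skill (each is just a random long/short sequence on noise), and assume we keep whichever one scores best on the fixed test period. All names and sizes here are illustrative assumptions.

```python
import random
import statistics

random.seed(1)

def sharpe(pnl, periods_per_year=252):
    """Annualised Sharpe ratio of a list of per-period returns."""
    return statistics.mean(pnl) / statistics.stdev(pnl) * periods_per_year ** 0.5

# Synthetic daily returns: pure noise, so no variant has real skill.
returns = [random.gauss(0.0, 0.01) for _ in range(1000)]
split = int(len(returns) * 0.7)   # chronological 70/30 split
test = returns[split:]            # the fixed 30% test period

# 200 hypothetical "design tweaks", each scored on the same test period.
test_sharpes = []
for _ in range(200):
    positions = [random.choice((-1, 1)) for _ in test]
    test_sharpes.append(sharpe([p * r for p, r in zip(positions, test)]))

typical = statistics.median(test_sharpes)
best = max(test_sharpes)

print(f"median variant Sharpe on the test set: {typical:.2f}")
print(f"best variant Sharpe on the test set:   {best:.2f}")
```

The median variant sits near zero, as it should. The best one looks like a strategy only because it was selected on the test set, which is exactly what iterating on a design against a fixed test period does.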
Walk-forward addresses this by repeatedly refitting on a rolling window. Train on Jan-Dec 2023, evaluate on Jan 2024. Train on Feb 2023-Jan 2024, evaluate on Feb 2024. And so on, one month at a time. Every prediction is generated by a model that has only seen data that existed before the prediction. Provided the model design itself is fixed in advance rather than re-tuned against those results, the 5-year out-of-sample Sharpe you see is the Sharpe a real operator would have achieved trading the strategy in real time.
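The window schedule described above can be sketched as a small generator. The function names (add_months, walk_forward_windows) are our own for this illustration, and window ends are treated as exclusive, so "train on Jan-Dec 2023" becomes the half-open interval from 2023-01-01 to 2024-01-01.

```python
from datetime import date

def add_months(d: date, n: int) -> date:
    """Shift a first-of-month date forward by n months."""
    idx = d.year * 12 + (d.month - 1) + n
    return date(idx // 12, idx % 12 + 1, 1)

def walk_forward_windows(first_train_month: date, n_steps: int, train_months: int = 12):
    """Yield (train_start, train_end, test_start, test_end); ends are exclusive."""
    for step in range(n_steps):
        train_start = add_months(first_train_month, step)
        train_end = add_months(train_start, train_months)
        test_start = train_end               # test begins where training stops
        test_end = add_months(test_start, 1)  # evaluate on one month
        yield train_start, train_end, test_start, test_end

windows = list(walk_forward_windows(date(2023, 1, 1), n_steps=3))
for tr_start, tr_end, te_start, te_end in windows:
    print(f"train [{tr_start}, {tr_end})  test [{te_start}, {te_end})")
```

The first window trains on calendar 2023 and tests on January 2024; the second slides both forward by one month. Because each test window starts exactly where its training window ends, no test observation can leak into the fit.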
Our published returns are all walk-forward. The models are refit monthly on a rolling 1-year window. Every NAV row reflects a model that, at that point in time, had no access to anything after that date. This is the only methodology under which a backtest's headline number is comparable to what a subscriber would actually realise.
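As a rough illustration of that refit loop (not our production code), the sketch below uses synthetic monthly returns and a stand-in model, fit, that simply goes long or short on the sign of the trailing mean. Both are assumptions made for the example. The audit list records, for every NAV row, the latest data point the model was allowed to see.

```python
import random

random.seed(2)

# Synthetic monthly returns (an assumption for illustration only).
monthly_returns = [random.gauss(0.005, 0.04) for _ in range(72)]
WINDOW = 12  # rolling 1-year training window, refit every month

def fit(train):
    """Hypothetical stand-in model: long if the trailing mean is positive, else short."""
    return 1 if sum(train) / len(train) > 0 else -1

nav = [1.0]
audit = []  # (prediction month index, last training month index)
for t in range(WINDOW, len(monthly_returns)):
    train_idx = range(t - WINDOW, t)  # strictly before month t
    position = fit([monthly_returns[i] for i in train_idx])
    nav.append(nav[-1] * (1 + position * monthly_returns[t]))
    audit.append((t, train_idx[-1]))

# Point-in-time check: no NAV row used data from its own month or later.
assert all(last_seen < t for t, last_seen in audit)
print(f"{len(nav) - 1} out-of-sample months, final NAV {nav[-1]:.3f}")
```

The assertion is the useful part: it is a cheap audit that can be run over any walk-forward backtest to confirm the point-in-time property holds for every row.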