In-sample vs out-of-sample testing is the process of splitting historical price data into two separate sections — one to build your strategy on and one to test it on as if it were new. Every algorithmic trader who uses backtesting needs to understand this distinction. Without it, a strategy that looks profitable on paper may simply reflect historical noise rather than a genuine edge.

What Is In-Sample vs Out-of-Sample Testing?

In-sample vs out-of-sample testing is a method for validating that a trading strategy performs well on data it has never seen before. The in-sample period is the data you use to build and optimise the strategy. The out-of-sample period is a separate block of data you hold back and only test on after you finish building the strategy.

The two datasets must never overlap. If you use the out-of-sample data to adjust your strategy — even once — it becomes in-sample data. The integrity of the test depends on keeping the two sections completely separate.

Why Does In-Sample vs Out-of-Sample Testing Matter?

Backtesting a strategy on the same data you used to build it almost always produces good-looking results. The strategy has, in effect, memorised the data. It fits the historical price action closely but may perform poorly on any data it has not seen before. Traders call this overfitting: a strategy is tuned so tightly to past data that it loses predictive value going forward.

Out-of-sample testing breaks that trap. When a strategy holds up on data it was never optimised against, you have meaningful evidence that the logic reflects a genuine market behaviour rather than a one-off historical pattern.

A strategy that passes in-sample but fails out-of-sample is a red flag. It tells you the strategy found noise, not signal.
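
To make the overfitting trap concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the returns are random noise with no edge, and the toy momentum rule and the `strategy_pnl` helper are hypothetical. The script picks the lookback that scored best in-sample, then checks the same lookback on the held-back data.

```python
import random

def strategy_pnl(returns, lookback):
    """Toy momentum rule: long the next bar if the last `lookback`
    returns sum to a positive number, otherwise short."""
    pnl = 0.0
    for t in range(lookback, len(returns)):
        signal = 1 if sum(returns[t - lookback:t]) > 0 else -1
        pnl += signal * returns[t]
    return pnl

# Hypothetical data: 1,000 bars of pure noise, so no rule has a real edge.
rng = random.Random(42)
returns = [rng.gauss(0, 0.01) for _ in range(1000)]

# Chronological 70/30 split with no overlap.
in_sample, out_sample = returns[:700], returns[700:]

# "Optimise" the lookback on the in-sample block only.
lookbacks = range(2, 60)
best = max(lookbacks, key=lambda n: strategy_pnl(in_sample, n))

in_sample_pnl = strategy_pnl(in_sample, best)
out_sample_pnl = strategy_pnl(out_sample, best)

print(f"best lookback: {best}")
print(f"in-sample P&L: {in_sample_pnl:.4f}")
print(f"out-of-sample P&L: {out_sample_pnl:.4f}")
```

With dozens of parameter values to choose from, the best in-sample result on noise will usually look profitable. The out-of-sample figure is the honest one, because the optimiser never saw that block.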

How Do You Split Your Data?

The most common approach is a simple chronological split. Use the earlier portion of your historical data for in-sample development. Reserve the most recent portion for out-of-sample testing. A common starting point is 70% in-sample and 30% out-of-sample, though the right ratio depends on how much total historical data you have available.
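
A chronological split can be sketched in a few lines of Python. This is an illustration only: the `chronological_split` helper and the sample `(date, close)` rows are hypothetical, and the 70% default simply mirrors the starting point mentioned above.

```python
from datetime import date, timedelta

def chronological_split(rows, in_sample_fraction=0.7):
    """Split time-stamped rows into (in_sample, out_of_sample).

    Rows are sorted oldest first, so the out-of-sample block is always
    the most recent data and the two blocks never overlap.
    """
    if not 0 < in_sample_fraction < 1:
        raise ValueError("in_sample_fraction must be between 0 and 1")
    ordered = sorted(rows, key=lambda row: row[0])
    cut = round(len(ordered) * in_sample_fraction)
    return ordered[:cut], ordered[cut:]

# Hypothetical data: 100 daily (date, close) rows.
start = date(2023, 1, 2)
prices = [(start + timedelta(days=i), 100.0 + i) for i in range(100)]

in_sample, out_of_sample = chronological_split(prices)

print(len(in_sample), "in-sample rows ending", in_sample[-1][0])
print(len(out_of_sample), "out-of-sample rows starting", out_of_sample[0][0])
```

Splitting by position after sorting by date guarantees that every out-of-sample row is later than every in-sample row.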

What Is Walk-Forward Testing?

Walk-forward testing extends the in-sample/out-of-sample logic across multiple rolling windows. You optimise the strategy on a block of data, test it on the period immediately after, then roll the window forward and repeat the process. The result is a chain of out-of-sample periods that together cover everything after the first optimisation window. This gives you a more robust picture of how the strategy holds up across different market conditions, not just one historical slice. Read our walk-forward analysis guide to go deeper on this method.
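
The rolling-window mechanics can be sketched as a small Python generator. This is a simplified illustration, not any platform's implementation: `walk_forward_windows` is a hypothetical helper, and the bar counts are arbitrary assumptions.

```python
def walk_forward_windows(n_bars, train_size, test_size):
    """Yield (train, test) index ranges for rolling walk-forward testing.

    Each test range starts exactly where its training range ends, and the
    whole window then rolls forward by one test block.
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("window sizes must be positive")
    start = 0
    while start + train_size + test_size <= n_bars:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size

# Assumed sizes: 1,000 bars, optimise on 500, test on the next 100.
windows = list(walk_forward_windows(n_bars=1000, train_size=500, test_size=100))

for train, test in windows:
    print(f"optimise on bars {train.start}-{train.stop - 1}, "
          f"test on bars {test.start}-{test.stop - 1}")
```

With these sizes the test blocks run back to back from bar 500 to bar 999, so every bar after the first optimisation window is tested out of sample exactly once.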

What About Cross-Validation?

Cross-validation comes from machine learning and involves rotating which portions of data act as in-sample and out-of-sample across multiple test rounds. It maximises data usage in small datasets. In trading, it requires care because time-series data has a natural order. You cannot randomly shuffle past and future price data without creating unrealistic test conditions that do not reflect how markets actually work.
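
One way to respect that ordering is to keep every training block strictly earlier than its test block. The sketch below is a minimal, assumed scheme similar in spirit to scikit-learn's TimeSeriesSplit, written in plain Python. The `time_series_folds` helper is hypothetical.

```python
def time_series_folds(n_bars, n_folds):
    """Yield (train, test) index ranges where training data always
    precedes the test data. The training range grows with each fold."""
    fold_size = n_bars // (n_folds + 1)
    if n_folds <= 0 or fold_size == 0:
        raise ValueError("not enough bars for the requested number of folds")
    for k in range(1, n_folds + 1):
        train = range(0, k * fold_size)
        test = range(k * fold_size, (k + 1) * fold_size)
        yield train, test

# Assumed sizes: 600 bars split into 5 ordered folds.
folds = list(time_series_folds(n_bars=600, n_folds=5))

for train, test in folds:
    print(f"train on bars 0-{train.stop - 1}, "
          f"test on bars {test.start}-{test.stop - 1}")
```

No fold ever trains on bars that come after its test block, so no round lets future prices leak into the past.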

What Are the Most Common Mistakes?

Peeking at out-of-sample data early is the most damaging error you can make. If you look at the out-of-sample results and then adjust the strategy, you contaminate the test. The out-of-sample section is no longer truly unseen.

Using too little data produces unreliable results in both sections. A strategy optimised on three months of data and tested on one month proves nothing meaningful. Aim for at least one to two years of in-sample data and six months or more of out-of-sample data, adjusted for your chosen timeframe.

Selecting your test period by performance is another trap to avoid. Some traders test multiple out-of-sample windows and report only the best one. That approach reintroduces selection bias and defeats the purpose of the test entirely.

Expecting perfect results from the out-of-sample period sets up false expectations. A small degradation in performance from in-sample to out-of-sample is normal. What you want to see is that the strategy remains directionally profitable and within a reasonable range of its in-sample metrics — not a perfect match.
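
One simple way to put a number on "a reasonable range" is to measure how much of the in-sample metric survives out of sample. The sketch below is illustrative only: the `retention` and `holds_up` helpers are hypothetical, and the 50% threshold is an arbitrary assumption, not an industry standard.

```python
def retention(in_sample_metric, out_of_sample_metric):
    """Fraction of the in-sample metric retained out of sample."""
    if in_sample_metric <= 0:
        raise ValueError("in-sample metric must be positive to compare")
    return out_of_sample_metric / in_sample_metric

def holds_up(in_sample_metric, out_of_sample_metric, min_retention=0.5):
    """True if the strategy stays profitable out of sample and keeps at
    least `min_retention` of its in-sample metric (assumed threshold)."""
    if out_of_sample_metric <= 0:
        return False
    return retention(in_sample_metric, out_of_sample_metric) >= min_retention

# Hypothetical profit factors from two backtests of the same strategy.
print(retention(2.0, 1.5))   # 0.75
print(holds_up(2.0, 1.5))    # True: profitable, modest degradation
print(holds_up(2.0, 0.4))    # False: most of the edge disappeared
```

Pick the threshold before you run the out-of-sample test. Choosing it after you see the result is another form of peeking.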

How to Apply In-Sample vs Out-of-Sample Testing in Arrow Algo

Arrow Algo’s backtest tool lets you set a specific date range for each test run. To apply an in-sample/out-of-sample split, run your first backtest on the in-sample date range and refine your strategy blocks and parameters based on those results. Once you settle on a final configuration, run a second backtest on the out-of-sample date range without making any further changes to the strategy logic.

Arrow Algo also supports walk-forward testing directly from the platform. You define the rolling window size and the optimisation frequency. Arrow Algo runs the rolling window sequence automatically and reports performance across each out-of-sample window. You see how the strategy holds up over time, not just in one cherry-picked period.

All of this happens within Arrow Algo’s no-code visual builder. You define your strategy logic with drag-and-drop blocks, set the date ranges on the backtest panel, and Arrow Algo delivers the results. For further context on how to interpret those results, read our guide to expectancy as a performance measure. The Investopedia overview of out-of-sample testing provides additional academic context on how this method applies across quantitative disciplines.

What Are the Key Takeaways?

In-sample data builds and optimises the strategy. Out-of-sample data tests it on conditions it has never seen.
Keeping the two datasets completely separate is essential — any overlap invalidates the test.
A strategy that fails out-of-sample is likely overfitted, tuned to past noise rather than genuine market logic.
Walk-forward testing extends this principle across multiple rolling windows for a more rigorous validation.
A small drop in performance from in-sample to out-of-sample is normal and expected.
Arrow Algo’s backtest tool supports both fixed date-range testing and walk-forward testing with no code required.

Educational disclaimer: This content is for educational purposes only and does not constitute financial advice. Trading involves significant risk and you should only trade with capital you can afford to lose. Past performance is not indicative of future results.
