What is a Walk-Forward Optimization and How to Run It?

Last Updated on June 24, 2020

Backtesting a strategy gives you a good understanding of what happened in the past, but it does not tell you the future. Fortunately, walk forward optimisations do tell you something about the future (to some extent).

What is walk-forward optimisation? Walk forward optimisation is a process for testing a trading strategy by finding its optimal trading parameters in a certain time period (called the in-sample or training data) and checking the performance of those parameters in the following time period (called the out-of-sample or testing data).

Table of contents:

How to run Walk Forward Optimisations (WFO)?
Why should we run Walk Forward Optimisations?
What is the Difference between In-Sample and Out-of-Sample data?
What is an Objective Function?
What should the Size of my In-Sample and Out-of-Sample Data be?
Does Walk Forward Optimisation Eliminate Overfitting?

How to run Walk Forward Optimisations (WFO)

Figure 1: Walk forward optimisation chart from 2015.07.01 to 2018.12.31, with 1-year in-sample periods and 0.5-year out-of-sample periods.

Here are the steps to run a walk forward optimisation:

1) Get all relevant data
2) Break data into multiple pieces
3) Run an optimisation to find the best parameters on the first piece of data (first in-sample)
4) Apply those parameters on the second piece of data (first out-of-sample)
5) Run an optimisation to find the best parameters on the next in-sample data
6) Apply those parameters on the next out-of-sample data
7) Repeat until you’ve covered all the pieces of data
8) Collate the performances of all the out-of-sample data

Step 1: Get all relevant data

The basic set of data you need is the price data of the financial product you are trading.

For example, if you are running a strategy that involves Uber and Lyft stocks, you need the prices of Uber and Lyft stocks over a period of time.

If your strategy requires any other data, which it normally does, you need those data as well.

These additional data could be the search traffic numbers for Uber and Lyft, the number of comments on ride-share drivers forums, the S&P Index price, and so on.

Step 2: Break data into multiple pieces

Let’s say that we are running our walk forward optimisation from 2015.07.01 to 2018.12.31. (Okay, I know Uber and Lyft weren’t listed during 2015, let’s just pretend they were.)

We break our data up into 10 pieces, as seen in Figure 1.

Our first piece of data is from 2015.07.01 to 2016.06.30. This is our first in-sample data.
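The split can be generated programmatically. Below is a minimal sketch, assuming the 1-year in-sample and half-year out-of-sample lengths from Figure 1, with each window stepping forward by the out-of-sample length. The function names `add_months` and `walk_forward_windows` are made up for this illustration, and only first-of-month start dates are handled.

```python
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Shift a first-of-month date forward by a number of months."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def walk_forward_windows(start: date, end: date, in_months: int, out_months: int):
    """Return a list of (in_start, in_end, out_start, out_end) tuples.

    Each window steps forward by the out-of-sample length, so consecutive
    in-sample periods overlap. All dates are inclusive.
    """
    windows = []
    in_start = start
    while True:
        out_start = add_months(in_start, in_months)
        out_end = add_months(out_start, out_months) - timedelta(days=1)
        if out_end > end:
            break  # not enough data left for a full out-of-sample period
        in_end = out_start - timedelta(days=1)
        windows.append((in_start, in_end, out_start, out_end))
        in_start = add_months(in_start, out_months)
    return windows


windows = walk_forward_windows(date(2015, 7, 1), date(2018, 12, 31), 12, 6)
for in_start, in_end, out_start, out_end in windows:
    print(f"in-sample {in_start} to {in_end} | out-of-sample {out_start} to {out_end}")
```

This yields 5 in-sample and 5 out-of-sample periods, which are the 10 pieces in the example. The last out-of-sample period ends on 2018.12.31.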

Step 3: Run an optimisation to find the best parameters on the first piece of data

We will run an optimisation on that first in-sample data to find what strategy parameters were the best for that period.

In this example, we shall assume that our strategy is to buy Uber and short Lyft if there are more positive comments about Uber than Lyft in the ride-share drivers forums, and vice versa. (Note that this is a grossly simplified strategy.)

Thus, a strategy parameter could be the number of positive comments for Uber divided by the number of positive comments for Lyft.

We shall call this PCUber/PCLyft.

To keep it simple, we have only one strategy parameter. However, most strategies usually have more than one parameter. (Note that having many strategy parameters does not necessarily make your strategy better!)

In our optimisation, we are trying to find out at what values of PCUber/PCLyft we should fire our trades.

Let’s assume that the optimisation results are out – we buy Uber and short Lyft when PCUber/PCLyft is 2 or higher and we short Uber and buy Lyft when PCUber/PCLyft is 0.5 or lower.

These parameters are known as the optimised parameters.
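One simple way to run such an optimisation is a grid search: try every candidate pair of thresholds on the in-sample data and keep the pair with the best score. The sketch below is only an illustration under stated assumptions. The observations are made up, each one pairs a PCUber/PCLyft ratio with the Uber-minus-Lyft return over the following period, and the score is plain total return. The function names are hypothetical.

```python
from itertools import product


def strategy_returns(observations, upper, lower):
    """Per-period returns of the simplified pair trade.

    observations: list of (ratio, spread_return), where spread_return is the
    Uber return minus the Lyft return over the following period.
    """
    returns = []
    for ratio, spread in observations:
        if ratio >= upper:
            returns.append(spread)    # long Uber, short Lyft
        elif ratio <= lower:
            returns.append(-spread)   # short Uber, long Lyft
        else:
            returns.append(0.0)       # no position
    return returns


def optimise(observations, uppers, lowers, objective=sum):
    """Grid search: return the (upper, lower) pair with the highest objective."""
    best_params, best_score = None, float("-inf")
    for upper, lower in product(uppers, lowers):
        score = objective(strategy_returns(observations, upper, lower))
        if score > best_score:
            best_params, best_score = (upper, lower), score
    return best_params, best_score


# Made-up in-sample data: (PCUber/PCLyft, Uber-minus-Lyft return next period)
in_sample = [
    (2.5, 0.03), (2.1, 0.02), (1.6, -0.01), (1.2, 0.00), (0.9, 0.01),
    (0.7, 0.02), (0.45, -0.02), (0.3, -0.03), (1.7, -0.02), (0.6, 0.01),
]

params, score = optimise(in_sample, uppers=[1.5, 2.0, 2.5], lowers=[0.25, 0.5, 0.75])
print(params, round(score, 4))
```

With this made-up data the search picks an upper threshold of 2.0 and a lower threshold of 0.5, matching the values used in the text. The `objective` argument can be swapped for any other metric that takes a list of returns.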

Step 4: Apply those parameters on the second piece of data

Our second piece of data is from 2016.07.01 to 2016.12.31. This is our first out-of-sample data.

We apply the optimised parameter values of 2 and 0.5 (for PCUber/PCLyft) to this second piece of data and check the performance of our strategy.

This performance is what we are interested in, as it tells us whether the data in the one-year (in-sample) period has any predictive value for the following half-year (out-of-sample) period.

Step 5: Run an optimisation to find the best parameters on the next in-sample data

We repeat step 3 for our second in-sample data and find the best parameters for PCUber/PCLyft.

Our second in-sample data is from 2016.01.01 to 2016.12.31.

Step 6: Apply those parameters on the next out-of-sample data

We repeat step 4 for our second out-of-sample data.

Our second out-of-sample data is from 2017.01.01 to 2017.06.30.

Step 7: Repeat until you’ve covered all the pieces of data

We repeat this process until we reach our last piece of out-of-sample data.

The last piece of out-of-sample data is from 2018.07.01 to 2018.12.31.

Step 8: Collate the performances of all the out-of-sample data

Collate the performance of your strategy in all the out-of-sample data.

This would be the performance of your strategy using a walk forward optimisation.

This process is much more reliable than using a simple backtest or optimisation.
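Steps 3 to 8 boil down to a short loop. The following is a minimal sketch rather than a full backtester. The `walk_forward` function, the single-threshold toy strategy and all the numbers are assumptions made for illustration. The point it shows is that only out-of-sample returns feed into the final result.

```python
def walk_forward(windows, optimise, evaluate):
    """Run steps 3 to 8 over a list of (in_sample, out_of_sample) data pairs.

    optimise(in_sample) -> params
    evaluate(data, params) -> list of per-period returns
    """
    results = []
    for in_sample, out_of_sample in windows:
        params = optimise(in_sample)                    # steps 3 and 5
        oos_returns = evaluate(out_of_sample, params)   # steps 4 and 6
        results.append({"params": params, "returns": oos_returns})
    # Step 8: collate out-of-sample returns only
    all_returns = [r for window in results for r in window["returns"]]
    return results, all_returns


# Toy strategy (hypothetical): go long when a signal is at or above a threshold
CANDIDATE_THRESHOLDS = [1, 2, 3]


def evaluate(data, threshold):
    """data is a list of (signal, next_period_return) pairs."""
    return [ret if signal >= threshold else 0.0 for signal, ret in data]


def optimise(in_sample):
    """Pick the threshold with the highest total in-sample return."""
    return max(CANDIDATE_THRESHOLDS, key=lambda t: sum(evaluate(in_sample, t)))


# Made-up data: two (in-sample, out-of-sample) pairs
windows = [
    ([(1, -0.01), (2, 0.02), (3, 0.03)], [(1, 0.01), (2, -0.01), (3, 0.02)]),
    ([(1, 0.02), (2, 0.01), (3, -0.01)], [(1, 0.01), (2, 0.01), (3, -0.01)]),
]

results, oos_returns = walk_forward(windows, optimise, evaluate)
for i, window in enumerate(results, start=1):
    print(f"window {i}: params={window['params']}, "
          f"out-of-sample total={sum(window['returns']):.4f}")
print(f"walk-forward total: {sum(oos_returns):.4f}")
```

Note that the chosen threshold changes from one window to the next. That is expected, since each out-of-sample period is traded with parameters fitted only on the data that came before it.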

Why should we run Walk Forward Optimisations

We run walk forward optimisation to reduce overfitting (AKA curve fitting) in our backtests and optimisations.

Overfitting in trading is the process of designing a trading system that adapts so closely to the noise in historical data that it becomes ineffective in the future.

We overfit by adapting our strategies to noise instead of signals. Signals are useful fundamental information. Noise is a distraction that offers nothing useful.

More information on overfitting here: What is Overfitting in Trading?

A walk forward optimisation forces us to verify that we are adjusting our strategy parameters to signals in the past by constantly testing our optimised parameters in out-of-sample data.

Inexperienced traders tend to spend a lot of time optimising every parameter on the entire set of past data. They then proceed to trade on these “optimised” parameters. This is usually a recipe for disaster.

What is the Difference between In-Sample and Out-of-Sample data

In-sample data is the data we use to train our strategy/model. Out-of-sample data is the data we test our trained strategy/model on.

What is an Objective Function

An objective function is a metric that we maximise or minimise in our optimisations.

During the optimisation with the in-sample data, we look for the “best” parameters. To find them, we look at which parameters maximise or minimise a certain metric.

This metric is called our objective function.

For instance, if our objective function is overall profit, we find the parameters that maximise the overall strategy profit during the in-sample period.

In most cases, we want an objective function that has an element of reward and risk. This is called a risk-adjusted metric.

Making $1,000 while risking $2,000 is worse than making $500 while risking $100. Without a risk-adjusted metric, your system will choose the former.
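A quick check of that arithmetic, as a minimal sketch. The labels A and B and the dictionary layout are only for illustration, and "risk" stands for whatever dollar risk measure you use.

```python
# Two hypothetical outcomes from the text: profit made and amount risked, in dollars
outcomes = {
    "A": {"profit": 1000, "risk": 2000},
    "B": {"profit": 500, "risk": 100},
}

# Objective 1: raw profit only
best_by_profit = max(outcomes, key=lambda name: outcomes[name]["profit"])

# Objective 2: profit per dollar risked (a simple risk-adjusted metric)
best_by_ratio = max(
    outcomes, key=lambda name: outcomes[name]["profit"] / outcomes[name]["risk"]
)

print(best_by_profit)  # A
print(best_by_ratio)   # B
```

Outcome A earns $0.50 per dollar risked, while outcome B earns $5.00, so the two objective functions disagree on which is "best".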

Some examples of these risk-adjusted objective functions are:

Sharpe Ratio (which is Excess Returns divided by Standard Deviation of Excess Returns)
Returns divided by Maximum Drawdowns
Returns divided by Average Drawdowns
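Two of these metrics can be sketched in a few lines of Python. The version below makes some assumptions of its own: returns are simple per-period returns, the Sharpe ratio is not annualised, the risk-free rate defaults to zero, and drawdown is measured on a compounded equity curve. Other conventions are common, so treat this as one possible reading rather than a standard.

```python
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the standard deviation of excess returns.

    Per period, not annualised. Needs at least two returns that are not all equal.
    """
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)


def max_drawdown(returns):
    """Largest peak-to-trough fall of the compounded equity curve, as a fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


def return_over_max_drawdown(returns):
    """Total compounded return divided by the maximum drawdown."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    drawdown = max_drawdown(returns)
    return (total - 1.0) / drawdown if drawdown > 0 else float("inf")


# Made-up per-period returns
sample = [0.02, -0.01, 0.03, -0.02, 0.01]
print(f"Sharpe ratio: {sharpe_ratio(sample):.3f}")
print(f"Max drawdown: {max_drawdown(sample):.3%}")
print(f"Return / max drawdown: {return_over_max_drawdown(sample):.3f}")
```

Either function can serve as the objective during the in-sample optimisation: instead of picking the parameters with the highest profit, pick the ones with the highest ratio.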

What should the Size of my In-Sample and Out-of-Sample Data be

The size of your in-sample data should be large enough that it can predict certain behaviour in the out-of-sample period, but not so large that it incorporates too much noise.

The underlying rationale is that the in-sample data contains some predictive power that lets you forecast something in the out-of-sample data.

If your in-sample data is too large, it contains too many false signals. If it is too small, it does not cover enough data to make accurate predictions.

The actual size depends on your trading strategy.

Example 1: Large in-sample vs small out-of-sample

A group of similar bond futures are supposed to move similarly. When they don’t, we trade the anomaly (short the expensive and long the cheap).

A large in-sample dataset is used to understand and model how the bonds move.

A smaller out-of-sample dataset is preferred, as you are regularly adapting your parameters to the recent behaviour of the bonds.

Example 2: Small in-sample vs large out-of-sample

You believe that a stock’s behaviour in the few days after its quarterly earnings and guidance are announced will affect how it behaves in the coming quarter (3 months).

In-sample data is a few days.

Out-of-sample data is 3 months.

Does Walk Forward Optimisation Eliminate Overfitting

No, it does not. It reduces overfitting but does not eliminate it completely.

Reason 1: Look-Ahead Bias

When we observe a market inefficiency that occurred in the past, we are fitting to an extent.

For instance, by observing that comments in ride-share forums could predict Uber and Lyft prices and designing a strategy based on this, we are fitting to past data.

Another example would be spotting that Amazon stock has trended over the years. We then run a walk forward optimisation using a trending strategy.

No doubt that the walk forward optimisation performances will be awesome. This doesn’t tell us anything other than the obvious – a trending strategy worked in the past if the stock had a strong trend.

But it’s okay. A small extent of fitting is not a bad thing. In fact, some fitting is required for us to start building strategies, and this is acceptable.

Reason 2: P-Hacking

P-hacking is also known as data dredging, data fishing, data snooping and data butchery.

It entails testing many random variations of parameter sets (without a proper hypothesis) and only focusing on the ones which perform well.

Let’s say that we’ve done a walk forward optimisation. The performances were bad.

We then proceed to modify the strategy rules a little and run another walk forward optimisation. The performances were bad again.

This is repeated 10 times until we find a set of strategy rules that has a positive walk forward optimisation performance.

Have we p-hacked?

The answer is, most likely.

When you run 10 walk forward optimisations (WFO) using variations of a similar strategy to find a good performer, you are most likely fitting to the noise.

What you should be looking for is to run 10 similar variations of a WFO strategy and have most of them produce positive results. This is a sign that you’re on to something legitimate.
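To see why one good result out of 10 attempts is weak evidence, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that every variation has no real edge and has the same independent chance of producing a positive WFO by luck. The 20% figure is an arbitrary assumption, and variations of a similar strategy are correlated in practice, so the real number will differ.

```python
def chance_of_lucky_winner(p_single, attempts):
    """Probability that at least one of several no-edge attempts looks good.

    Assumes each attempt is independent with the same chance p_single of
    showing a positive result by luck alone.
    """
    return 1.0 - (1.0 - p_single) ** attempts


# Assumed value: a strategy with no edge shows a positive WFO 20% of the time
p_single = 0.2

for attempts in (1, 5, 10):
    print(f"{attempts:>2} attempts: {chance_of_lucky_winner(p_single, attempts):.1%}")
```

Under these assumptions, 10 attempts give roughly an 89% chance of finding at least one "winner" that is nothing but noise.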
