Build a custom backtester with Python

Last Updated on April 3, 2024

Table of contents:

  1. What is a backtester?
  2. Why should I build a custom backtester with Python?
  3. Why shouldn’t I build a custom backtester with Python?
  4. What are some existing Python backtesters?
  5. Getting started
  6. Creating a data handler with the OpenBB Platform
  7. Creating a strategy processor
  8. Creating the main backtester logic
  9. How to backtest a crossover strategy with Python?
  10. How to backtest a mean-reversion strategy with Python?
  11. How to backtest a pairs trading strategy with Python?
  12. How to backtest a strategy with alternative data with Python?
  13. Final thoughts
  14. Full code

What is a backtester?

A backtester is a tool that allows you to test your algorithmic trading strategies against real historical asset data. It helps traders hone their strategies and provides valuable feedback on potential performance.

Why should I build a custom backtester with Python?

Building a custom backtester can be challenging but also rewarding. Here are some reasons why you might want to do it:

  • Knowledge – building a custom backtester will expand your knowledge of coding, trading, and more.
  • Ownership and transparency – you will own the entire backtesting pipeline and see exactly how the signals are calculated and how the trades are executed.
  • Personalization and customization – you will be able to completely customize your tool to include all your required features, and personalize it to your preferred user experience.

Why shouldn’t I build a custom backtester with Python?

The reasons why you shouldn’t build a custom backtester are the following:

  • It’s hard – building a good custom backtester with all the bells and whistles takes considerable effort.
  • Existing tools – there are already backtesting tools out there with various degrees of quality. Moreover, you might be able to modify these existing tools to suit your specialized needs.
  • Requires maintenance and time – your backtester is only as good as the time and effort you’re willing to put into improving it.

What are some existing Python backtesters?

Some existing Python backtesters that you might want to check out are the following ones:

Getting started

In this article, we will be creating a simple skeleton of a custom Python backtester that should have the following features:

  • Modularity – we want the backtester to be modular so that parts can be easily reorganized, swapped, or built upon.
  • Extendability – the code should be easily extendable.
  • Support single and multi-asset strategies
  • Have access to historical equity data and multiple data providers
  • Incorporate trading fee commission
  • Have performance metrics

To make this possible, we will need to have several key components which are the following:

  • Data Management: Handles the ingestion, storage, and retrieval of OHLCV data, as well as any alternative data sources for generating signals.
  • Signal Generation: Contains the logic for analyzing data and generating buy/sell signals based on predefined strategies or indicators.
  • Execution Engine: Simulates the execution of trades based on signals, considering commissions, slippage, and optionally, bid-ask spreads.
  • Performance Evaluation: Calculates key performance metrics such as return, volatility, Sharpe ratio, drawdowns, etc., to evaluate the strategy’s effectiveness.
  • Utilities: Includes logging, configuration management, and any other supportive functionality.

For this, I’ll be leveraging the power of several Python libraries:

You can find the code in our GitHub repository. Once cloned, you can install everything by running the following inside a fresh environment:

poetry install

Allow me to share a bit about my thought process when approaching this.

My overarching design aim is to have a set of modules that govern the outlined key components. In other words, I want to have a module that specializes in data management, a module for trade execution, and so on.

This decouples the code, keeps it cleaner, and makes it easier to extend. The main pain point I wanted to address here is how hard it is to extend and customize the existing backtesters out there.

What I disliked about quite a few backtesters is how hard it is to design and run multi-asset strategies, that they gate-keep the data, or that they only allow trading of a particular asset class. All of these things should be mitigated.

The design I was going for is Object-Oriented Programming (OOP), where classes let us maintain the state of the backtesting process.

Note: All strategies shown are very basic and for demo and learning purposes only. Please don’t try to use them in a real market setting.

Creating a data handler with the OpenBB Platform

Creating a data handler with the OpenBB Platform is rather straightforward. The headaches caused by different API conventions, different providers, messy outputs, data validation, and the like are handled for us.

It also removes the need to create custom classes for data validation and processing. It gives you seamless access to many data providers, hundreds of data points, different asset classes, and more. It also guarantees what is returned based on the standard it implements.

That said, I’ll stick with just equities and constrain the data to daily candles. You can easily expand this and change it to your liking. I’ll allow the user to change the provider, symbol, and start and end dates.

What I like about the OpenBB Platform is that it has endpoints that accept multiple tickers, and the historical price endpoint used below is one of them. This means that we are already on a good track to supporting multi-asset trading by passing a comma-separated list of symbols.

To set up the OpenBB Platform, I advise following this guide here.

Here is the DataHandler code:

"""Data handler module for loading and processing data."""

from typing import Optional

import pandas as pd
from openbb import obb


class DataHandler:
    """Data handler class for loading and processing data."""

    def __init__(
        self,
        symbol: str,
        start_date: Optional[str] = None,
        end_date: Optional[str] = None,
        provider: str = "fmp",
    ):
        """Initialize the data handler."""
        self.symbol = symbol.upper()
        self.start_date = start_date
        self.end_date = end_date
        self.provider = provider

    def load_data(self) -> pd.DataFrame | dict[str, pd.DataFrame]:
        """Load equity data."""
        data = obb.equity.price.historical(
            symbol=self.symbol,
            start_date=self.start_date,
            end_date=self.end_date,
            provider=self.provider,
        ).to_df()

        if "," in self.symbol:
            data = data.reset_index().set_index("symbol")
            return {symbol: data.loc[symbol] for symbol in self.symbol.split(",")}

        return data

    def load_data_from_csv(self, file_path) -> pd.DataFrame:
        """Load data from CSV file."""
        return pd.read_csv(file_path, index_col="date", parse_dates=True)

Notice how it returns a dictionary of Pandas dataframes when multiple symbols are being passed. I’ve also added a function that can load data from a custom CSV file and use the date column as its index. Feel free to expand and change this to your liking and needs.

To get some data, all we need to do is to initialize the class and call the load_data method like this:

data = DataHandler("AAPL").load_data()
data.head()
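
To see what the multi-symbol branch of load_data does without calling a data provider, here is a minimal sketch on a small synthetic frame. It assumes the provider response has a date index and a symbol column, which is what the reset_index().set_index("symbol") step implies. The split_by_symbol helper and the prices are made up for illustration.

```python
import pandas as pd

# Synthetic stand-in for a multi-symbol response (assumed shape):
# a date index plus a "symbol" column. Prices are invented.
raw = pd.DataFrame(
    {
        "symbol": ["AAPL", "MSFT", "AAPL", "MSFT"],
        "close": [185.0, 370.0, 186.5, 372.1],
    },
    index=pd.to_datetime(
        ["2024-01-02", "2024-01-02", "2024-01-03", "2024-01-03"]
    ).rename("date"),
)


def split_by_symbol(df: pd.DataFrame, symbols: str) -> dict[str, pd.DataFrame]:
    """Hypothetical helper mirroring the multi-symbol branch of load_data."""
    # Move the date into a column and index by symbol so .loc can select one asset.
    by_symbol = df.reset_index().set_index("symbol")
    return {symbol: by_symbol.loc[symbol] for symbol in symbols.split(",")}


frames = split_by_symbol(raw, "AAPL,MSFT")
print(frames["AAPL"])
```

Each frame in the dictionary is indexed by symbol and carries the date as a regular column, which matters later when the frames are merged.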

Creating a strategy processor

The next step is to have a module that will process our strategies. By this, I mean something that can generate signals based on the strategy requirements and append them to the data so that the executor can use them for backtesting.

What I’m going for here is a base class for strategies that developers can inherit from, change, or use to build their own custom ones. I also want it to work seamlessly with multiple assets, applying the same signal logic to each one.

Here is what the code for it looks like:

from typing import Any

import pandas as pd


class Strategy:
    """Base class for trading strategies."""

    def __init__(self, indicators: dict, signal_logic: Any):
        """Initialize the strategy with indicators and signal logic."""
        self.indicators = indicators
        self.signal_logic = signal_logic

    def generate_signals(
        self, data: pd.DataFrame | dict[str, pd.DataFrame]
    ) -> pd.DataFrame | dict[str, pd.DataFrame]:
        """Generate trading signals based on the strategy's indicators and signal logic."""
        if isinstance(data, dict):
            for _, asset_data in data.items():
                self._apply_strategy(asset_data)
        else:
            self._apply_strategy(data)
        return data

    def _apply_strategy(self, df: pd.DataFrame) -> None:
        """Apply the strategy to a single dataframe."""
        for name, indicator in self.indicators.items():
            df[name] = indicator(df)

        df["signal"] = df.apply(lambda row: self.signal_logic(row), axis=1)
        df["positions"] = df["signal"].diff().fillna(0)

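It works by taking a dictionary of indicators that need to be computed and the logic to use for generating the signals, which can be -1 for selling and +1 for buying. The positions column holds the bar-to-bar change in the signal, so a non-zero value marks the bar where the signal flips.

The way it is coded right now, we pass it lambda functions. Each indicator function receives the whole dataframe (even though the examples name its parameter row), while the signal logic is applied to one row at a time.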

Here’s an example of how we can use it on the data we retrieved in the previous step:

strategy = Strategy(
    indicators={
        "sma_20": lambda row: row["close"].rolling(window=20).mean(),
        "sma_60": lambda row: row["close"].rolling(window=60).mean(),
    },
    signal_logic=lambda row: 1 if row["sma_20"] > row["sma_60"] else -1,
)
data = strategy.generate_signals(data)
data.tail()

In the above example, I created a fast (20-day) and a slow (60-day) simple moving average on the closing prices and then defined my trading logic: the signal is buy (+1) while the fast average is above the slow one and sell (-1) otherwise, including the warm-up bars where the averages are not yet defined.
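
To make the mechanics concrete, here is a minimal sketch that mirrors the body of _apply_strategy as a free function and runs it on six made-up closing prices. The apply_strategy name, the short 2- and 3-bar windows, and the prices are assumptions chosen so the result can be checked by hand.

```python
import pandas as pd


def apply_strategy(df: pd.DataFrame, indicators: dict, signal_logic) -> pd.DataFrame:
    """Hypothetical free-function version of Strategy._apply_strategy."""
    for name, indicator in indicators.items():
        # Each indicator is called with the whole DataFrame and returns a column.
        df[name] = indicator(df)
    # The signal logic is called once per row.
    df["signal"] = df.apply(signal_logic, axis=1)
    # Bar-to-bar change in the signal; non-zero where the signal flips.
    df["positions"] = df["signal"].diff().fillna(0)
    return df


prices = pd.DataFrame({"close": [10.0, 11.0, 12.0, 11.0, 9.0, 8.0]})
result = apply_strategy(
    prices,
    indicators={
        "sma_2": lambda df: df["close"].rolling(window=2).mean(),
        "sma_3": lambda df: df["close"].rolling(window=3).mean(),
    },
    signal_logic=lambda row: 1 if row["sma_2"] > row["sma_3"] else -1,
)
print(result[["close", "sma_2", "sma_3", "signal", "positions"]])
```

The first two bars get a signal of -1 because a comparison against a missing average evaluates to false, which is worth keeping in mind when reading results for short date ranges.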

Now that we have a way to get data and generate trading signals, all we’re missing is a way to actually run the backtest. This is the most complex part.

Creating the main backtester logic

The main backtester logic consists of several parts. The main ones are:

  • Trade executor
  • Commission calculator
  • Performance metric calculator
  • Portfolio handler
  • The glue between all of them

Let us begin by defining the class and setting some basic variables we want it to handle:

from typing import Dict, List

import matplotlib.pyplot as plt
import pandas as pd


class Backtester:
    """Backtester class for backtesting trading strategies."""

    def __init__(
        self,
        initial_capital: float = 10000.0,
        commission_pct: float = 0.001,
        commission_fixed: float = 1.0,
    ):
        """Initialize the backtester with initial capital and commission fees."""
        self.initial_capital: float = initial_capital
        self.commission_pct: float = commission_pct
        self.commission_fixed: float = commission_fixed
        self.assets_data: Dict = {}
        self.portfolio_history: Dict = {}
        self.daily_portfolio_values: List[float] = []

Now, we will define the trade executor:

def execute_trade(self, asset: str, signal: int, price: float) -> None:
    """Execute a trade based on the signal and price."""
    if signal > 0 and self.assets_data[asset]["cash"] > 0:  # Buy
        trade_value = self.assets_data[asset]["cash"]
        commission = self.calculate_commission(trade_value)
        shares_to_buy = (trade_value - commission) / price
        self.assets_data[asset]["positions"] += shares_to_buy
        self.assets_data[asset]["cash"] -= trade_value
    elif signal < 0 and self.assets_data[asset]["positions"] > 0:  # Sell
        trade_value = self.assets_data[asset]["positions"] * price
        commission = self.calculate_commission(trade_value)
        self.assets_data[asset]["cash"] += trade_value - commission
        self.assets_data[asset]["positions"] = 0

The trade executor buys the asset if the signal is greater than 0 and sells it if the signal is less than 0. It only buys when there is cash available and only sells when there is an open position. A buy uses all the cash allocated to that asset, with the commission deducted before the number of shares is calculated, and a sell closes the whole position.

The commission is the larger of the percentage fee and the fixed fee:

def calculate_commission(self, trade_value: float) -> float:
    """Calculate the commission fee for a trade."""
    return max(trade_value * self.commission_pct, self.commission_fixed)
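
A quick worked example of that rule, written as a standalone function with the same default fees as the Backtester class (0.1% and a fixed 1.0). The trade sizes and the share price of 100 are made up for illustration.

```python
def calculate_commission(
    trade_value: float, commission_pct: float = 0.001, commission_fixed: float = 1.0
) -> float:
    """Standalone copy of the commission rule: the larger of the two fees."""
    return max(trade_value * commission_pct, commission_fixed)


# Small trade: 0.1% of 500 is 0.50, so the fixed fee of 1.00 applies.
small_fee = calculate_commission(500.0)

# Large trade: 0.1% of 10,000 is about 10.00, which exceeds the fixed fee.
large_fee = calculate_commission(10_000.0)

# Shares bought when all 10,000 goes into a stock priced at 100:
# the fee is taken out of the trade value first, as in execute_trade.
shares_bought = (10_000.0 - large_fee) / 100.0

print(f"{small_fee:.2f} {large_fee:.2f} {shares_bought:.2f}")
```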

Now, we need to track our positions for the assets we are trading and their values and history:

def update_portfolio(self, asset: str, price: float) -> None:
    """Update the portfolio with the latest price."""
    self.assets_data[asset]["position_value"] = (
        self.assets_data[asset]["positions"] * price
    )
    self.assets_data[asset]["total_value"] = (
        self.assets_data[asset]["cash"] + self.assets_data[asset]["position_value"]
    )
    self.portfolio_history[asset].append(self.assets_data[asset]["total_value"])

Finally, the backtester can now be run by using these methods like this:

def backtest(self, data: pd.DataFrame | dict[str, pd.DataFrame]):
    """Backtest the trading strategy using the provided data."""
    if isinstance(data, pd.DataFrame):  # Single asset
        data = {
            "SINGLE_ASSET": data
        }  # Convert to dict format for unified processing

    for asset in data:
        self.assets_data[asset] = {
            "cash": self.initial_capital / len(data),
            "positions": 0,
            "position_value": 0,
            "total_value": 0,
        }
        self.portfolio_history[asset] = []

        for date, row in data[asset].iterrows():
            self.execute_trade(asset, row["signal"], row["close"])
            self.update_portfolio(asset, row["close"])
            if len(self.daily_portfolio_values) < len(data[asset]):
                self.daily_portfolio_values.append(
                    self.assets_data[asset]["total_value"]
                )
            else:
                self.daily_portfolio_values[
                    len(self.portfolio_history[asset]) - 1
                ] += self.assets_data[asset]["total_value"]
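
The loop above does two things: it simulates each asset on its own equal share of the capital, and it adds the per-asset values together bar by bar. The sketch below shows the same idea in a compact, standalone form. The simulate_asset helper, tickers, prices, and signals are all hypothetical, and it assumes every asset has the same number of bars.

```python
def simulate_asset(closes, signals, cash, pct=0.001, fixed=1.0):
    """Hypothetical all-in/all-out simulation for one asset."""
    shares = 0.0
    history = []
    for price, signal in zip(closes, signals):
        if signal > 0 and cash > 0:  # Buy with all available cash
            commission = max(cash * pct, fixed)
            shares += (cash - commission) / price
            cash = 0.0
        elif signal < 0 and shares > 0:  # Sell the whole position
            proceeds = shares * price
            cash += proceeds - max(proceeds * pct, fixed)
            shares = 0.0
        history.append(cash + shares * price)
    return history


initial_capital = 2000.0
assets = {
    "AAA": ([100.0, 110.0, 120.0, 90.0], [1, 0, -1, 0]),
    "BBB": ([50.0, 50.0, 40.0, 60.0], [0, 1, 0, 0]),
}

# Equal capital split, one equity curve per asset.
per_asset = {
    name: simulate_asset(closes, signals, initial_capital / len(assets))
    for name, (closes, signals) in assets.items()
}

# Portfolio value per bar is the sum across assets.
portfolio = [sum(values) for values in zip(*per_asset.values())]
print([round(value, 2) for value in portfolio])
```

Summing with zip makes the equal-length assumption explicit: if one asset had fewer bars, the extra bars of the other would be dropped rather than silently misaligned.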

Now, I’ll add a method to calculate some metrics, which can be expanded by using third-party libraries or the like. I’ll also do the same for the plotting features. The calculate_* helper functions it calls, and the rest of the exact code, can be seen in the repo.

def calculate_performance(self, plot: bool = True) -> None:
    """Calculate the performance of the trading strategy."""
    if not self.daily_portfolio_values:
        print("No portfolio history to calculate performance.")
        return

    portfolio_values = pd.Series(self.daily_portfolio_values)
    daily_returns = portfolio_values.pct_change().dropna()

    total_return = calculate_total_return(
        portfolio_values.iloc[-1], self.initial_capital
    )
    annualized_return = calculate_annualized_return(
        total_return, len(portfolio_values)
    )
    annualized_volatility = calculate_annualized_volatility(daily_returns)
    sharpe_ratio = calculate_sharpe_ratio(annualized_return, annualized_volatility)
    sortino_ratio = calculate_sortino_ratio(daily_returns, annualized_return)
    max_drawdown = calculate_maximum_drawdown(portfolio_values)

    print(f"Final Portfolio Value: {portfolio_values.iloc[-1]:.2f}")
    print(f"Total Return: {total_return * 100:.2f}%")
    print(f"Annualized Return: {annualized_return * 100:.2f}%")
    print(f"Annualized Volatility: {annualized_volatility * 100:.2f}%")
    print(f"Sharpe Ratio: {sharpe_ratio:.2f}")
    print(f"Sortino Ratio: {sortino_ratio:.2f}")
    print(f"Maximum Drawdown: {max_drawdown * 100:.2f}%")

    if plot:
        self.plot_performance(portfolio_values, daily_returns)

def plot_performance(self, portfolio_values: pd.Series, daily_returns: pd.Series):
    """Plot the performance of the trading strategy."""
    plt.figure(figsize=(10, 6))

    plt.subplot(2, 1, 1)
    plt.plot(portfolio_values, label="Portfolio Value")
    plt.title("Portfolio Value Over Time")
    plt.legend()

    plt.subplot(2, 1, 2)
    plt.plot(daily_returns, label="Daily Returns", color="orange")
    plt.title("Daily Returns Over Time")
    plt.legend()

    plt.tight_layout()
    plt.show()
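
The metric helpers called in calculate_performance live in the repo and are not shown here. As a rough guide to what they compute, here is a minimal sketch using common textbook definitions. It assumes 252 trading days per year, a risk-free rate of zero, and a Sortino ratio based on the standard deviation of negative daily returns; the repo’s actual implementations may differ.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # Assumed number of trading days per year


def calculate_total_return(final_value: float, initial_value: float) -> float:
    return final_value / initial_value - 1


def calculate_annualized_return(total_return: float, num_days: int) -> float:
    # Compound the total return to a one-year horizon.
    return (1 + total_return) ** (TRADING_DAYS / num_days) - 1


def calculate_annualized_volatility(daily_returns: pd.Series) -> float:
    return daily_returns.std() * np.sqrt(TRADING_DAYS)


def calculate_sharpe_ratio(ann_return: float, ann_volatility: float) -> float:
    # Risk-free rate assumed to be zero.
    return ann_return / ann_volatility


def calculate_sortino_ratio(daily_returns: pd.Series, ann_return: float) -> float:
    # Downside deviation taken as the std of negative returns (one of several conventions).
    downside = daily_returns[daily_returns < 0].std() * np.sqrt(TRADING_DAYS)
    return ann_return / downside


def calculate_maximum_drawdown(portfolio_values: pd.Series) -> float:
    # Largest drop from a running peak, expressed as a negative fraction.
    return (portfolio_values / portfolio_values.cummax() - 1).min()


values = pd.Series([100.0, 110.0, 99.0, 121.0, 115.0])  # Made-up equity curve
returns = values.pct_change().dropna()

total = calculate_total_return(values.iloc[-1], values.iloc[0])
annualized = calculate_annualized_return(total, len(values))
sortino = calculate_sortino_ratio(returns, annualized)
drawdown = calculate_maximum_drawdown(values)
print(f"total={total:.2%} max_drawdown={drawdown:.2%}")
```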

Now that the backtester is ready to go, let us try it out with a couple of different strategies.

How to backtest a crossover strategy with Python?

The goal for this strategy will be to create a very basic crossover strategy where we use a fast-moving simple moving average (SMA) and a slow-moving one. When the fast crosses above the slow we buy and vice-versa.

We will be trading the AAPL and MSFT stocks for this. Here is how we can do it:

from backtester.data_handler import DataHandler
from backtester.backtester import Backtester
from backtester.strategies import Strategy

symbol = "AAPL,MSFT"
start_date = "2023-01-01"
end_date = "2023-12-31"

data = DataHandler(
    symbol=symbol, start_date=start_date, end_date=end_date
).load_data()

# Define your strategy, indicators, and signal logic here
strategy = Strategy(
    indicators={
        "sma_20": lambda row: row["close"].rolling(window=20).mean(),
        "sma_60": lambda row: row["close"].rolling(window=60).mean(),
    },
    signal_logic=lambda row: 1 if row["sma_20"] > row["sma_60"] else -1,
)
data = strategy.generate_signals(data)

backtester = Backtester()
backtester.backtest(data)
backtester.calculate_performance()

Final Portfolio Value: 11804.58
Total Return: 18.05%
Annualized Return: 18.20%
Annualized Volatility: 13.06%
Sharpe Ratio: 1.39
Sortino Ratio: 2.06
Maximum Drawdown: -12.07%

Great! The Backtester is working as it should. To test a single asset, we just need to change the symbol to a single one. For example, to backtest NFLX with the same strategy all I need to change is that, and here are the results:

Final Portfolio Value: 13999.01
Total Return: 39.99%
Annualized Return: 40.37%
Annualized Volatility: 22.55%
Sharpe Ratio: 1.79
Sortino Ratio: 1.94
Maximum Drawdown: -15.62%

How to backtest a mean-reversion strategy with Python?

To backtest a mean-reversion strategy with Python, we will use our custom backtester and leverage its modularity and ease of chaining operations. First, let us lay out the strategy logic:

The goal of the strategy is to sell the asset if it is trading more than 3 standard deviations above the rolling mean and to buy the asset if it is trading more than 3 standard deviations below the rolling mean.

This has a couple of implications for it to work properly:

  • We need to have a rolling mean
  • We need to calculate the rolling standard deviation (STD) over the same window
  • We need to calculate the upper and lower bounds

Because our Strategy class computes the indicators in the order they appear in the dictionary, we can chain these calculations together by following their logical order and creating signals based on them.

Let us start by defining the base backtesting parameters:

symbol = "HE"
start_date = "2022-01-01"
end_date = "2022-12-31"

Now, all we need to do is to get the data, chain the operations together, and see what the results are:

data = DataHandler(symbol=symbol, start_date=start_date, end_date=end_date).load_data()

# Define your strategy, indicators, and signal logic here
strategy = Strategy(
    indicators={
        "sma_50": lambda row: row["close"].rolling(window=50).mean(),
        "std_3": lambda row: row["close"].rolling(window=50).std() * 3,
        "std_3_upper": lambda row: row["sma_50"] + row["std_3"],
        "std_3_lower": lambda row: row["sma_50"] - row["std_3"],
    },
    signal_logic=lambda row: (
        1
        if row["close"] < row["std_3_lower"]
        else -1 if row["close"] > row["std_3_upper"] else 0
    ),
)
data = strategy.generate_signals(data)

backtester = Backtester()
backtester.backtest(data)
backtester.calculate_performance()

Final Portfolio Value: 10725.54
Total Return: 7.26%
Annualized Return: 7.29%
Annualized Volatility: 18.32%
Sharpe Ratio: 0.40
Sortino Ratio: 0.53
Maximum Drawdown: -23.37%

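We can now easily run experiments by chaining more operations or changing them. For example, what happens if we base the STD off the rolling mean instead? Note that the line below calls .std() on the entire sma_50 column, so it yields one constant band width computed from the full backtest period, including future data, rather than a rolling value:

"std_3": lambda row: row["sma_50"].std() * 3,

Final Portfolio Value: 12062.36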
Total Return: 20.62%
Annualized Return: 20.71%
Annualized Volatility: 13.12%
Sharpe Ratio: 1.58
Sortino Ratio: 1.71
Maximum Drawdown: -7.19%
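
The difference between the two band definitions is easy to check on synthetic data. In this minimal sketch (the linear price series is made up), the rolling version produces one width per bar, while the whole-column version produces a single number that changes when later data is removed, which is a sign that it depends on the full sample.

```python
import numpy as np
import pandas as pd

# Synthetic, steadily rising closing prices (illustration only).
close = pd.Series(np.linspace(100.0, 130.0, 120))
sma_50 = close.rolling(window=50).mean()

# First experiment: rolling std of the close, one band width per bar.
rolling_width = close.rolling(window=50).std() * 3

# Second experiment: std of the whole sma_50 column, a single scalar.
global_width = sma_50.std() * 3

# The same calculation on a shorter history gives a different width.
truncated_width = sma_50.iloc[:80].std() * 3

print(type(rolling_width).__name__, float(global_width), float(truncated_width))
```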

How to backtest a pairs trading strategy with Python?

Backtesting a pairs trading strategy with Python is an even more complex example. But, our backtester shouldn’t have issues executing it. The main thing that makes it more complex here is that we will want to have data for both assets in a single dataframe. Let us define the strategy first.

The assets that we will trade are Roku (ROKU) and Netflix (NFLX) as we already have a sense of their cointegrated nature based on our previous articles and analyses.

The idea is to enter a position when one stock has moved at least 5% more than the other over the last five days: we sell the one that ran ahead and buy the one that lagged until the move reverses. Let us set up everything and do some quick data wrangling:

import pandas as pd

symbol = "NFLX,ROKU"
start_date = "2023-01-01"

data = DataHandler(
    symbol=symbol,
    start_date=start_date,
).load_data()

data = pd.merge(
    data["NFLX"].reset_index(),
    data["ROKU"].reset_index(),
    left_index=True,
    right_index=True,
    suffixes=("_NFLX", "_ROKU"),
)

# We want to trade the ROKU stock so we rename the close_ROKU column to close
data = data.rename(columns={"close_ROKU": "close"})
data.head()
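
The merge above aligns the two frames by row position, which works when both symbols have exactly the same trading dates. A slightly more defensive variant is to join on the date column instead. Here is a minimal sketch on made-up frames, assuming each per-symbol frame carries a date column after reset_index():

```python
import pandas as pd

# Made-up per-symbol frames; ROKU is missing one date on purpose.
nflx = pd.DataFrame(
    {
        "date": pd.to_datetime(["2023-01-03", "2023-01-04", "2023-01-05"]),
        "close": [300.0, 310.0, 305.0],
    }
)
roku = pd.DataFrame(
    {
        "date": pd.to_datetime(["2023-01-03", "2023-01-05"]),
        "close": [40.0, 42.0],
    }
)

# Join on the date so rows are only paired when both assets traded that day.
merged = pd.merge(nflx, roku, on="date", how="inner", suffixes=("_NFLX", "_ROKU"))

# Same convention as above: the traded asset's close becomes "close".
merged = merged.rename(columns={"close_ROKU": "close"}).set_index("date")
print(merged)
```

With an inner join, a date missing from either side is dropped instead of shifting every later row out of alignment.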

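Now, all we require is the trading logic and we can run the backtester. Note that the logic below is a simplified proxy for that idea: it trades only ROKU, buying when NFLX has gained more than 5% over five days and selling when NFLX has lost more than 5%. The ROKU lookback column is computed but not used in the signal: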

strategy = Strategy(
    indicators={
        "day_5_lookback_NFLX": lambda row: row["close_NFLX"].shift(5),
        "day_5_lookback_ROKU": lambda row: row["close"].shift(5),
    },
    signal_logic=lambda row: (
        1
        if row["close_NFLX"] > row["day_5_lookback_NFLX"] * 1.05
        else -1 if row["close_NFLX"] < row["day_5_lookback_NFLX"] * 0.95 else 0
    ),
)
data = strategy.generate_signals(data)

backtester = Backtester()
backtester.backtest(data)
backtester.calculate_performance()

Final Portfolio Value: 14387.50
Total Return: 43.88%
Annualized Return: 34.80%
Annualized Volatility: 55.77%
Sharpe Ratio: 0.62
Sortino Ratio: 0.74
Maximum Drawdown: -39.86%
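
The signal above keys off NFLX alone. One way to express the relative-move idea described earlier (one stock moving 5% more than the other over five days) is to compare the two five-day returns directly. This is a minimal sketch, assuming the merged column names from above; the relative_move_signal helper and the sample prices are hypothetical.

```python
import pandas as pd


def relative_move_signal(
    df: pd.DataFrame, lookback: int = 5, threshold: float = 0.05
) -> pd.Series:
    """Hypothetical signal for the traded asset (ROKU, stored in "close")."""
    ret_nflx = df["close_NFLX"].pct_change(lookback)
    ret_roku = df["close"].pct_change(lookback)
    spread = ret_nflx - ret_roku

    signal = pd.Series(0, index=df.index)
    signal[spread > threshold] = 1  # ROKU lagged NFLX: buy ROKU
    signal[spread < -threshold] = -1  # ROKU ran ahead of NFLX: sell ROKU
    return signal


# Made-up prices: NFLX jumps on bar 5, ROKU jumps on bar 6.
sample = pd.DataFrame(
    {
        "close_NFLX": [100.0, 100.0, 100.0, 100.0, 100.0, 110.0, 100.0],
        "close": [50.0, 50.0, 50.0, 50.0, 50.0, 50.0, 56.0],
    }
)
sample["signal"] = relative_move_signal(sample)
print(sample)
```

Because the signal column is precomputed here, it could be passed to the Strategy class through signal_logic=lambda row: row["signal"] with an empty indicators dictionary.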

How to backtest a strategy with alternative data with Python?

To backtest a strategy with alternative data with Python, all we need to do is to use the custom backtester to load our custom dataset that will be used for trading. Alternatively, we can also combine alternative data with the data fetched from the DataHandler.

For this use case, I’ll be loading a custom CSV file that has a signal column that is calculated based on the sentiment score that was extracted. If the sentiment is positive we will buy and vice-versa.

We load the data and run the backtester:

data = DataHandler(symbol="HE").load_data_from_csv("example_data.csv")

strategy = Strategy(
    indicators={},
    signal_logic=lambda row: (
        1
        if row["trade_signal_sentiment"] > 0
        else -1
    ),
)
data = strategy.generate_signals(data)

backtester = Backtester()
backtester.backtest(data)
backtester.calculate_performance()

Final Portfolio Value: 9128.27
Total Return: -8.72%
Annualized Return: -8.75%
Annualized Volatility: 17.93%
Sharpe Ratio: -0.49
Sortino Ratio: -0.58
Maximum Drawdown: -19.98%

Notice how we didn’t need any indicators and just passed an empty dictionary. Working with this skeleton is quite flexible.
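
For reference, here is a minimal sketch of what such a file could look like and how the signal is derived from it. The trade_signal_sentiment column name comes from the example above; the rows themselves are invented, and the file is read from an in-memory string so the snippet runs on its own.

```python
import io

import pandas as pd

# Invented CSV contents: a date index column, a close price, and a sentiment score.
csv_text = """date,close,trade_signal_sentiment
2023-01-03,40.0,0.6
2023-01-04,41.0,-0.2
2023-01-05,39.5,0.1
"""

# Same call as load_data_from_csv, pointed at the in-memory text.
data = pd.read_csv(io.StringIO(csv_text), index_col="date", parse_dates=True)

# Same rule as the signal_logic lambda: positive sentiment buys, otherwise sell.
data["signal"] = data.apply(
    lambda row: 1 if row["trade_signal_sentiment"] > 0 else -1, axis=1
)
print(data)
```

The backtester only needs the close and signal columns, so any dataset that can be reduced to those two columns can be plugged in the same way.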

Final thoughts

Sometimes, the solutions that are out there aren’t quite the right fit for your needs and they are not easy to adapt, extend, modify, or work with. Some of them are quite good but not maintained and have so many dependencies that they become unstable.

There are times when it makes sense to spend some time creating a custom tool that will help you with your day-to-day tasks. I have shown you a simple backtester skeleton that can easily be adapted, changed, modified, and worked with.

It can be further polished, have extensions, more metrics, more charts, or anything that you might need. The code is open-sourced and you can play with it, create PRs, and more.

The same philosophy can be applied to other tools that you might be interested in, not only backtesting.

Full code

GitHub Link

Igor Radovanovic