Kalman Filter (02) – S&P 500 and Dow Jones Pairs Trading

Kalman 02

Our previous article on the Kalman filter produced a simple linear regression. That model is designed to handle noisy data, but it does not handle streaming data or time-adaptive parameters. In this post we study a more general use of the Kalman filter.

Simple Pairs Trading

From the linear regression between the S&P 500 and Dow Jones ETFs, a simple strategy can easily be derived. The regression parameters \alpha and \beta give us not just an equation to evaluate the relative value of the price pair, but also confidence ranges for detecting divergence. Here we can modify the linear relationship

\text{DIA}_k = \beta_{k} \times \text{SPY}_k + \alpha_k + V_k,

to a two-sigma event detector below

\text{DIA}_k = \left( \beta_{k} \pm 2 \times \sigma_{\beta_k} \right) \times \text{SPY}_k + \left( \alpha_k \pm 2 \times \sigma_{\alpha_k} \right) + V_k.

If

\text{DIA}_k > \left( \beta_{k} + 2 \times \sigma_{\beta_k} \right) \times \text{SPY}_k + \left( \alpha_k + 2 \times \sigma_{\alpha_k} \right),

we might simply go long the S&P 500 ETF (SPY) and short the Dow Jones ETF (DIA) and wait for them to converge, and vice versa when

\text{DIA}_k < \left( \beta_{k} - 2 \times \sigma_{\beta_k} \right) \times \text{SPY}_k + \left( \alpha_k - 2 \times \sigma_{\alpha_k} \right).

Moreover, we close the position when the prices converge back inside the 2\sigma range. We provide sample code to test the strategy from 2000 to 2015. Divergence events are captured using the 95\% confidence intervals from an ordinary least-squares fit with the `statsmodels` module.

parameters   coef      std err   t value    [95.0% Conf. Int.]
\alpha       -2.0651   0.385     -5.364     [-2.820, -1.310]
\beta         0.8903   0.003     264.258    [0.884, 0.897]
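To make the two-sigma rule concrete, here is a minimal sketch of the detector. It is not the notebook's code: it fits the regression with plain NumPy rather than `statsmodels`, runs on synthetic prices, and the helper names `ols_with_std_err` and `two_sigma_signal` are made up for illustration.

```python
import numpy as np

def ols_with_std_err(x, y):
    """Fit y = beta * x + alpha by least squares.

    Returns (beta, alpha, se_beta, se_alpha), where the standard errors
    come from the usual OLS covariance sigma^2 * (X'X)^{-1}.
    """
    X = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    se = np.sqrt(np.diag(cov))
    return coef[0], coef[1], se[0], se[1]

def two_sigma_signal(spy, dia, beta, alpha, se_beta, se_alpha):
    """+1: long SPY / short DIA, -1: short SPY / long DIA, 0: inside the band."""
    upper = (beta + 2 * se_beta) * spy + (alpha + 2 * se_alpha)
    lower = (beta - 2 * se_beta) * spy + (alpha - 2 * se_alpha)
    if dia > upper:
        return 1
    if dia < lower:
        return -1
    return 0

# Synthetic prices standing in for SPY and DIA (illustrative only)
rng = np.random.default_rng(0)
spy = np.linspace(100.0, 200.0, 500)
dia = 0.89 * spy - 2.0 + rng.normal(0.0, 1.0, size=spy.size)

beta, alpha, se_beta, se_alpha = ols_with_std_err(spy, dia)
signal = two_sigma_signal(150.0, 0.89 * 150.0 - 2.0 + 10.0,
                          beta, alpha, se_beta, se_alpha)
```

A return value of 0 corresponds to the "close the position" region, since the price is back inside the band.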

The Python implementation is included in the IPython Notebook file. Let’s check the outcome:

Kalman 02

There are two difficulties with the strategy.

  • The linear regression is fitted on the whole price history, so its parameters already contain future information. We could not have traded on this regression from the beginning of the sample.
  • From our last finding, the parameters (\alpha_k, \beta_k) should not be constant.

We expect the Kalman filter to resolve both issues.

Kalman Filter

The full model of a Kalman filter can be characterized as

x_{k+1} = A_k x_k + W_k


z_k = B_k x_k + V_k,

where

  • x_k is the latent state vector,
  • z_k is the observation vector,
  • A_k is the transition matrix,
  • B_k is the observation matrix,
  • W_k is the transition white noise, following the distribution \mathscr{N}(0, Q_k),
  • V_k is the measurement white noise, following the distribution \mathscr{N}(0, R_k).

By letting A_k be the identity matrix, x_k = [\beta_k, \alpha_k]', B_k = [\text{SPY}_k, 1], and z_k = \text{DIA}_k, we obtain the time-adaptive linear regression model between the two prices: the observation equation becomes \text{DIA}_k = \beta_k \times \text{SPY}_k + \alpha_k + V_k.
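As a rough illustration of this setup, one predict/update step of the filter can be written out by hand in NumPy. This is a sketch under assumptions, not the pykalman implementation: the function name `kalman_regression_step`, the synthetic prices, and the values chosen for Q and R are all made up for the example.

```python
import numpy as np

def kalman_regression_step(mean, cov, spy, dia, Q, R):
    """One Kalman step for the state x = [beta, alpha] with A = I.

    Returns the updated mean and covariance, plus the innovation
    (DIA minus its prediction) and the innovation variance.
    """
    # Predict: the state is a random walk, so only the covariance grows
    pred_mean = mean
    pred_cov = cov + Q

    # Observation row B_k = [SPY_k, 1]
    B = np.array([[spy, 1.0]])
    innovation = dia - (B @ pred_mean)[0]
    innovation_var = (B @ pred_cov @ B.T)[0, 0] + R

    # Kalman gain and update
    K = (pred_cov @ B.T)[:, 0] / innovation_var
    new_mean = pred_mean + K * innovation
    new_cov = pred_cov - np.outer(K, B[0]) @ pred_cov
    return new_mean, new_cov, innovation, innovation_var

# Synthetic prices with a known relationship (illustrative only)
rng = np.random.default_rng(1)
spy_prices = np.linspace(100.0, 150.0, 300)
dia_prices = 0.9 * spy_prices - 2.0 + rng.normal(0.0, 0.1, size=spy_prices.size)

mean = np.zeros(2)          # initial guess for [beta, alpha]
cov = 10.0 * np.eye(2)      # vague initial uncertainty (assumed)
Q = 1e-10 * np.eye(2)       # assumed transition noise covariance
R = 0.01                    # assumed measurement noise variance

for spy, dia in zip(spy_prices, dia_prices):
    mean, cov, innovation, innovation_var = kalman_regression_step(
        mean, cov, spy, dia, Q, R)

beta_hat, alpha_hat = mean
```

Each price pair is processed once and then discarded, which is what makes the estimate usable on streaming data.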

So how does the Kalman filter improve the model?

One advantage of the Kalman filter is that it estimates the distributions of the latent factors. Inference on these parameters gives us more to work with when building a mean-reverting trading strategy.

The noise W_k drives the distribution of the latent variable x. We cannot observe x directly, so we start with an initial guess and then continuously update the mean and variance of x_k. V_k is the observation noise: given the distribution of x_k, it describes how the errors between the estimates and the actual values behave. If V_k is large, then in our case linear regression might not be a good model for the relationship between the S&P 500 ETF and the Dow Jones ETF. We will make assumptions about Q_k and R_k.

Okay, now what do we have? We can observe prices from the market, and we assume a linear relationship between the price dynamics. Time-adaptive parameters need an updating mechanism that takes in streaming data, and this is another advantage of the Kalman filter: it supports online updating. This can be done with the filter_update function from the pykalman module.

For pairs trading with the Kalman filter, we need to decide two timings:

  1. When to buy and sell?
  2. When to close?

The updated state variance in the Kalman filter provides a hint here, since it indicates upper and lower bounds on the difference between the DIA and SPY prices derived from the model. At each iteration the Kalman filter outputs the updated mean and variance of x_{k+1} - x_k, estimated at time k. These statistics are computed through the Kalman gain; details can be found in the literature. We then simply enter a position if the price difference is larger than a certain number of standard deviations of the price deviation. To achieve this, we compute the estimation error variance

O_k = \text{Var}(x_{k+1} - x_k),

P_k = B_k O_k B_k' + R_k,

where R_k is the variance of the measurement noise V_k, and use it to identify divergence events. That is, we trade the pair when |z_k - \text{Mean}(z_k)| > \lambda P_k, where \lambda controls the number of standard deviations.
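The entry test can be sketched as a small function. The name `divergence_signal` and the example numbers are hypothetical. One assumption to flag loudly: the sketch takes the square root of the predicted observation variance so that \lambda counts standard deviations in price units; whether the notebook scales P this way is not shown in the post.

```python
import numpy as np

def divergence_signal(state_mean, state_cov, spy, dia, R, lam=1.0):
    """Return +1 (long SPY / short DIA), -1 (short SPY / long DIA) or 0.

    state_mean is [beta, alpha], state_cov its 2x2 covariance, and R the
    measurement noise variance.
    """
    B = np.array([spy, 1.0])
    predicted_dia = B @ state_mean
    # Predicted observation variance: B O B' + R
    P = B @ state_cov @ B + R
    band = lam * np.sqrt(P)  # assumption: threshold measured in standard deviations
    if dia > predicted_dia + band:
        return 1
    if dia < predicted_dia - band:
        return -1
    return 0

# Made-up state estimate for illustration
state_mean = np.array([0.89, -2.0])
state_cov = np.diag([1e-6, 1e-2])
signal_high = divergence_signal(state_mean, state_cov, 150.0, 132.5, R=0.25)
signal_flat = divergence_signal(state_mean, state_cov, 150.0, 131.6, R=0.25)
signal_low = divergence_signal(state_mean, state_cov, 150.0, 130.5, R=0.25)
```

The sign convention follows the rule used later in the post: DIA trading above the band means long SPY and short DIA.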

By setting \lambda=1, we have

Kalman 03

Improved Kalman Filter Pairs Trading

We can increase the accuracy of the linear regression prediction by also estimating the speed and acceleration with which the regression parameters change. This is a common method in signal processing. Through Taylor expansion, we have

x_{k+1} = x_k + x_k' \times \Delta t + \frac{1}{2} x_k'' \times \Delta t^2,

where x_k', x_k'' denote the speed and acceleration of the latent factor x at time k. To plug it into our Kalman filter, we stack each parameter with its derivatives into the state [\beta_k, \beta_k', \beta_k'', \alpha_k, \alpha_k', \alpha_k'']', take \Delta t = 1, and set

A_k = \left[ \begin{array}{cccccc}      1 & 1 & \frac{1}{2} & 0 & 0 & 0  \\      0 & 1 & 1 & 0 & 0 & 0  \\      0 & 0 & 1 & 0 & 0 & 0 \\      0 & 0 & 0 & 1 & 1 & \frac{1}{2} \\      0 & 0 & 0 & 0 & 1 & 1 \\      0 & 0 & 0 & 0 & 0 & 1 \\      \end{array} \right].

The measurement matrix, however, only picks out \beta_k and \alpha_k. Since z_k = \text{DIA}_k is a scalar, it is the row vector

B_k = \left[ \begin{array}{cccccc}      \text{SPY}_k & 0 & 0 & 1 & 0 & 0      \end{array} \right],

which maintains the same simple linear regression form.

Just to make the model simpler, we set \Delta t = 1 and apply the trick only to first order on \alpha_k and \beta_k, so the state becomes [\beta_k, \beta_k', \alpha_k, \alpha_k']'. The key parts of the code and the result are shown below.
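The matrices above follow a regular pattern, so they can be generated for any expansion order. The sketch below is only an illustration; the helper names `taylor_block`, `transition_matrix`, and `observation_row` are made up here and are not part of pykalman.

```python
import numpy as np
from math import factorial

def taylor_block(order, dt=1.0):
    """Upper-triangular block propagating [x, x', x'', ...] by one step.

    Entry (i, j) is dt^(j-i) / (j-i)! for j >= i, i.e. the Taylor
    coefficients of each derivative in terms of the higher ones.
    """
    n = order + 1
    block = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            block[i, j] = dt ** (j - i) / factorial(j - i)
    return block

def transition_matrix(order, dt=1.0, n_params=2):
    """Block-diagonal A_k: one Taylor block per regression parameter."""
    return np.kron(np.eye(n_params), taylor_block(order, dt))

def observation_row(spy, order):
    """Row B_k that picks beta (times SPY) and alpha out of the stacked state."""
    row = np.zeros(2 * (order + 1))
    row[0] = spy            # beta is the first entry of the first block
    row[order + 1] = 1.0    # alpha is the first entry of the second block
    return row

A_second_order = transition_matrix(order=2)   # the 6x6 matrix shown above
A_first_order = transition_matrix(order=1)    # the 4x4 simplification
B_first_order = observation_row(spy=120.0, order=1)
```

With `order=1` the observation row has the form [SPY, 0, 1, 0], which is the shape passed as `observation_matrix` in the update code below.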

Creating Kalman Filter

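[sourcecode language="python" light="true" wraplines="false" collapse="false"]
from pykalman import KalmanFilter

# Create a Kalman Filter Object
kf = KalmanFilter(n_dim_obs=1,
                  # ... the remaining constructor arguments are cut off in the source
                  )
# Estimate the filter parameters with EM on the 1999 DIA prices
kf = kf.em(df_data.loc['1999', 'DIA'].values)

# Linear Regression Outputs
# First-order state is [beta, beta', alpha, alpha'], matching the
# observation matrix [SPY, 0, 1, 0] used in the update step
alpha = kf.initial_state_mean[2]
beta = kf.initial_state_mean[0]
[/sourcecode]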

Kalman Filter Update

[sourcecode language="python" light="true" wraplines="false" collapse="false"]
for row in df_data.loc['2000':].iterrows():
    price_spy = row[1]['SPY']
    price_dia = row[1]['DIA']

    # Kalman filter update
    updated_mean, updated_variance = kf.filter_update(
        filtered_state_mean=updated_mean,
        observation_matrix=np.expand_dims(np.array([price_spy, 0, 1, 0]), 0),
        # ... the call is cut off in the source; pykalman's filter_update
        # also takes filtered_state_covariance and observation
    )
[/sourcecode]

Partial rule for buying and selling: to go long SPY and short DIA, we check whether

\text{DIA}_k > \beta_{k} \times \text{SPY}_k + \alpha_k + \lambda \times P_k.

Trading (Long SPY, Short DIA)

[sourcecode language="python" light="true" wraplines="false" collapse="false"]
if (price_dia > price_spy * beta + alpha + 1 * P and
        not long_spy_short_dia):  # Long SPY-DIA Spread
    if short_spy_long_dia:
        # Close the opposite position first and realize its P&L
        total_wealth += shares_spy * (price_spy - cost_spy) + shares_dia * (price_dia - cost_dia)
        short_spy_long_dia = False
    shares_spy = total_wealth / price_spy
    shares_dia = -1 * total_wealth / price_dia
    cost_spy = price_spy
    cost_dia = price_dia
    long_spy_short_dia = True
[/sourcecode]

Performance is improved.

Kalman 04

What’s Next?

We have shown how the Kalman filter can be used for pairs trading between the S&P 500 ETF and the Dow Jones ETF. We doubled the Sharpe ratio by implementing a second-order time-adaptive linear regression based on the Kalman filter and Taylor expansion. In our next topic on the Kalman filter, we will examine N-asset pairs trading and probably the non-linear Kalman filter.
