# Kalman Filter (02) – S&P 500 and Dow Jones Pairs Trading

Our previous article on the Kalman filter produced a simple linear regression output. That model handles noisy data, but it does not handle streaming data or time-adaptive parameters. In this post we study a more general use of the Kalman filter.

From the linear regression result between the S&P 500 and Dow Jones ETFs, a simple strategy can be easily derived. The regression parameters $\alpha$ and $\beta$ give us not just an equation to evaluate the relative value of the price pair, but also confidence ranges for detecting divergence signals. Here we can modify the linear relationship $\text{DIA}_k = \beta_{k} \times \text{SPY}_k + \alpha_k + V_k,$

to a two-sigma event detector below $\text{DIA}_k = \left( \beta_{k} \pm 2 \times \sigma_{\beta_k} \right) \times \text{SPY}_k + \left( \alpha_k \pm 2 \times \sigma_{\alpha_k} \right) + V_k.$

When $\text{DIA}_k > \left( \beta_{k} + 2 \times \sigma_{\beta_k} \right) \times \text{SPY}_k + \left( \alpha_k + 2 \times \sigma_{\alpha_k} \right),$

we might simply go long the S&P 500 ETF (SPY) and short the Dow Jones ETF (DIA) and wait for them to converge, and vice versa when $\text{DIA}_k < \left( \beta_{k} - 2 \times \sigma_{\beta_k} \right) \times \text{SPY}_k + \left( \alpha_k - 2 \times \sigma_{\alpha_k} \right).$

Moreover, we close the position when the prices converge back within the $2\sigma$ range. We provide sample code to test the strategy from the year 2000 to 2015. The divergence event is captured using the $95\%$ confidence interval calculated from the ordinary least-squares fitting results of the statsmodels module.

| parameters | coef | std err | t value | [95.0% Conf. Int.] |
|---|---|---|---|---|
| $\alpha$ | -2.0651 | 0.385 | -5.364 | [-2.820, -1.310] |
| $\beta$ | 0.8903 | 0.003 | 264.258 | [0.884, 0.897] |
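As a rough illustration of the two-sigma rule, the sketch below plugs the coefficients and standard errors from the table into the band equations. The function name `two_sigma_signal` and the returned labels are hypothetical, and the band ordering assumes positive prices.

```python
# Coefficients and standard errors copied from the regression table.
ALPHA, SE_ALPHA = -2.0651, 0.385
BETA, SE_BETA = 0.8903, 0.003


def two_sigma_signal(spy, dia):
    """Classify one (SPY, DIA) price pair against the two-sigma bands.

    Hypothetical helper: assumes spy > 0, so beta + 2*sigma gives the upper band.
    """
    upper = (BETA + 2 * SE_BETA) * spy + (ALPHA + 2 * SE_ALPHA)
    lower = (BETA - 2 * SE_BETA) * spy + (ALPHA - 2 * SE_ALPHA)
    if dia > upper:
        return "long_spy_short_dia"   # DIA looks rich relative to SPY
    if dia < lower:
        return "short_spy_long_dia"   # DIA looks cheap relative to SPY
    return "inside_band"              # no divergence; close any open position


for dia in (90.0, 87.0, 85.0):
    print(dia, two_sigma_signal(100.0, dia))
```

With SPY at 100 the bands sit at roughly 85.59 and 88.33, so the three made-up DIA prices fall above, inside, and below the range respectively.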

The Python implementation is included in the IPython Notebook file. Looking at the outcome, there are two difficulties with the strategy.

• The linear regression is fitted on the whole price history. Its parameters are therefore not available at the start of the backtest, so we cannot actually trade the regressed pair from the beginning.
• From our last finding, the parameters $(\alpha_k, \beta_k)$ should not be constant over time.

We expect the Kalman filter to resolve both issues.

### Kalman Filter

The full model of a Kalman filter can be characterized as $x_{k+1} = A_k x_k + W_k$

and $z_k = B_k x_k + V_k$

where

• $x_k$ is the latent (state) vector,
• $z_k$ is the observation vector,
• $A_k$ is the transition matrix,
• $B_k$ is the observation matrix,
• $W_k$ is the transition white noise following the distribution $\mathscr{N}(0, Q_k)$,
• $V_k$ is the measurement white noise following the distribution $\mathscr{N}(0, R_k)$.

By letting $A_k$ be the identity matrix, $x_k = [\beta_k, \alpha_k]'$, $B_k = [\text{SPY}_k, 1]$, and $z_k = \text{DIA}_k$, we obtain the time-adaptive linear regression model between the two prices.

So how does the Kalman filter improve the model?

One advantage of the Kalman filter is that it estimates the distributions of the latent factors. Inference on these parameters gives us more to work with when building a mean-reverting trading strategy.

The noise $W_k$ affects the distribution of the latent variable $x$. We cannot observe $x$ directly, so we start with an initial conjecture and then continuously update the mean and variance of $x_k$. $V_k$ is the observation noise: given what we know about the distribution of $x_k$, it describes how the errors between the estimates and the actual values behave. If $V_k$ is large, then in our case linear regression might not be a good model for the relationship between the S&P 500 ETF and the Dow Jones ETF. We will make an assumption on $Q_k$ and $R_k$.
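To get a feel for what the assumption on $Q_k$ and $R_k$ does, the sketch below iterates the variance recursion of a scalar random-walk state observed directly (an assumed simplification of our model, with observation matrix equal to 1). The helper name `steady_state_gain` is hypothetical.

```python
def steady_state_gain(q, r, n_iter=2000):
    """Iterate the scalar Kalman recursion for a random-walk state until the gain settles."""
    p = 1.0                      # initial state variance (arbitrary starting conjecture)
    gain = 0.0
    for _ in range(n_iter):
        p_pred = p + q           # transition noise inflates the uncertainty
        gain = p_pred / (p_pred + r)
        p = (1.0 - gain) * p_pred
    return gain


slow = steady_state_gain(q=1e-4, r=1.0)   # parameters assumed to be nearly constant
fast = steady_state_gain(q=1.0, r=1.0)    # parameters allowed to move quickly
print(f"gain with small Q: {slow:.4f}, gain with large Q: {fast:.4f}")
```

A small $Q$ relative to $R$ gives a small gain, so each new price barely moves the estimate; a large $Q$ lets the parameters adapt quickly at the cost of following more noise.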

Okay, now what do we have? We can observe the prices from the market, and we assume a linear relationship between the price dynamics. Time-adaptive parameters need an updating mechanism that takes in streaming data, and here the Kalman filter has another advantage: online optimization. It can be accomplished by using the filter_update function from the pykalman module.

For pairs trading with Kalman filter, we need to decide the two timings:

1. When to buy and sell?
2. When to close?

The updated state variance in the Kalman filter provides a hint here, since it gives upper and lower bounds on the difference between the observed DIA price and the price the model derives from SPY. At each iteration the Kalman filter outputs the updated mean and variance of $x_{k+1} - x_k$, estimated at time $k$. These statistics are computed through the Kalman gain; details can be found in the standard literature. We then simply enter a position when the price difference is larger than a certain number of standard deviations. To achieve this, we compute the estimation error variance $O_k = \text{Var}(x_{k+1} - x_k)$ and the standard deviation of the resulting price deviation, $P_k = \sqrt{B_k O_k B_k' + R_k},$ where $R_k$ is the measurement noise variance defined above,

and use it to identify the divergence event. That is, we trade the pair when $|z_k - \text{Mean}(z_k)| > \lambda P_k$, where $\lambda$ controls the number of standard deviations.
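The entry test can be sketched as follows. This is an illustration under stated assumptions rather than the notebook's implementation: the state is taken to be $[\beta_k, \alpha_k]$, the band is taken to be $\lambda$ times the square root of $B_k O_k B_k' + R_k$, and the function name and labels are hypothetical.

```python
import numpy as np


def divergence_signal(x_mean, O, R, spy, dia, lam=1.0):
    """Compare the DIA forecast error with lam standard deviations.

    Assumptions (not from the article's notebook): the state is [beta, alpha]
    and the band is lam * sqrt(B O B' + R).
    """
    B = np.array([spy, 1.0])
    z_mean = float(B @ x_mean)              # model-implied DIA price
    z_std = float(np.sqrt(B @ O @ B + R))   # standard deviation of the forecast error
    if dia - z_mean > lam * z_std:
        return "long_spy_short_dia"
    if dia - z_mean < -lam * z_std:
        return "short_spy_long_dia"
    return "hold"


# Made-up state estimate and variances, for illustration only.
x_mean = np.array([0.9, -2.0])
O = np.diag([1e-4, 1e-2])
for dia in (90.0, 89.0, 86.0):
    print(dia, divergence_signal(x_mean, O, R=0.99, spy=100.0, dia=dia))
```

Here the model-implied price is 88 and the band is about 1.41 wide on each side, so only the first and last prices trigger a trade.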

We set $\lambda=1$ in the backtest.

### Improved Kalman Filter Pairs Trading

We can increase the accuracy of the linear regression prediction by also estimating the speed (rate of change) and acceleration of the regression parameters. This is a common method in signal processing. Through Taylor expansion, we have $x_{k+1} \approx x_k + x_k' \times \Delta t + \frac{1}{2} x_k'' \times \Delta t^2,$

where $x_k', x_k''$ denote the speed and acceleration of the latent factor $x$ at time $k$. To plug this into our Kalman filter, we stack each parameter with its derivatives, $x_k = [\beta_k, \beta_k', \beta_k'', \alpha_k, \alpha_k', \alpha_k'']'$, take $\Delta t = 1$, and set $A_k = \left[ \begin{array}{cccccc} 1 & 1 & \frac{1}{2} & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & \frac{1}{2} \\ 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{array} \right].$

The measurement matrix, however, only picks out $\beta_k$ and $\alpha_k$ (the first and fourth state components): $B_k = \left[ \begin{array}{cccccc} \text{SPY}_k & 0 & 0 & 1 & 0 & 0 \end{array} \right],$
so that $z_k = B_k x_k + V_k = \beta_k \times \text{SPY}_k + \alpha_k + V_k$ maintains the same simple linear regression form.
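The matrices can also be generated rather than typed by hand. The sketch below is one possible construction, assuming the observation is the scalar $\text{DIA}_k$ (so the observation matrix reduces to a single row) and that the state stacks $\beta$ with its derivatives first, then $\alpha$ with its derivatives. All helper names are hypothetical.

```python
from math import factorial

import numpy as np


def taylor_block(order, dt=1.0):
    """Upper-triangular block propagating [value, speed, acceleration, ...] by one step."""
    block = np.zeros((order + 1, order + 1))
    for i in range(order + 1):
        for j in range(i, order + 1):
            block[i, j] = dt ** (j - i) / factorial(j - i)
    return block


def transition_matrix(order, n_params=2, dt=1.0):
    """Block-diagonal A_k: one Taylor block per regression parameter (beta, then alpha)."""
    return np.kron(np.eye(n_params), taylor_block(order, dt))


def observation_row(spy, order):
    """Row vector picking beta (times SPY) and alpha out of the stacked state."""
    row = np.zeros(2 * (order + 1))
    row[0] = spy            # multiplies beta_k
    row[order + 1] = 1.0    # multiplies alpha_k
    return row


A = transition_matrix(order=2)
print(A)
print(observation_row(spy=100.0, order=2))
```

Calling the same helpers with `order=1` gives a 4x4 transition matrix and the observation row `[SPY, 0, 1, 0]`.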

To keep the model simpler, we set $\Delta t = 1$ and apply the trick only to first order for $\alpha_k$ and $\beta_k$, so the state becomes $[\beta_k, \beta_k', \alpha_k, \alpha_k']'$. The key parts of the code and the result are shown below.

Creating Kalman Filter

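[sourcecode language="python" light="true" wraplines="false" collapse="false"]
from pykalman import KalmanFilter

# Create a Kalman Filter Object
# A, Q, R and array_SPY are assumed to be defined earlier in the notebook;
# array_SPY holds one observation row per 1999 trading day.
kf = KalmanFilter(n_dim_obs=1,
                  n_dim_state=4,
                  transition_matrices=A,
                  transition_covariance=Q,
                  observation_covariance=R,
                  observation_matrices=array_SPY)
# Estimate the remaining parameters by EM on the 1999 DIA prices
kf = kf.em(df_data.loc['1999']['DIA'].values)

# Linear Regression Outputs (state order: beta, beta', alpha, alpha')
beta = kf.initial_state_mean[0]
alpha = kf.initial_state_mean[2]
[/sourcecode]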

Kalman Filter Update

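[sourcecode language="python" light="true" wraplines="false" collapse="false"]
import numpy as np

# Start from the state estimated on the 1999 data (assumed starting point)
updated_mean = kf.initial_state_mean
updated_variance = kf.initial_state_covariance

# iterrows() yields (index, row) pairs
for date, row in df_data.loc['2000':].iterrows():
    price_spy = row['SPY']
    price_dia = row['DIA']

    # Kalman filter update
    updated_mean, updated_variance = kf.filter_update(
        filtered_state_mean=updated_mean,
        filtered_state_covariance=updated_variance,
        observation_matrix=np.expand_dims(np.array([price_spy, 0, 1, 0]), 0),
        observation=np.array([price_dia]))
[/sourcecode]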

Part of the buying and selling rule: to go long SPY and short DIA, we check whether $\text{DIA}_k > \beta_{k} \times \text{SPY}_k + \alpha_k + \lambda \times P_k.$

[sourcecode language="python" light="true" wraplines="false" collapse="false"]
if (price_dia > price_spy * beta + alpha + 1 * P and
        not long_spy_short_dia):  # Long SPY-DIA Spread
    if short_spy_long_dia:
        # Realize the P&L of the opposite position before flipping
        total_wealth += shares_spy * (price_spy - cost_spy) + shares_dia * (price_dia - cost_dia)
        short_spy_long_dia = False
    shares_spy = total_wealth / price_spy
    shares_dia = -1 * total_wealth / price_dia
    cost_spy = price_spy
    cost_dia = price_dia
    long_spy_short_dia = True
[/sourcecode]
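The wealth accounting in the snippet above can be checked with a small worked example. The two helpers below are hypothetical wrappers around the same arithmetic, and the prices are made up for illustration.

```python
def open_spread(total_wealth, price_spy, price_dia):
    """Open long SPY / short DIA, sizing each leg at the full current wealth."""
    shares_spy = total_wealth / price_spy
    shares_dia = -1 * total_wealth / price_dia
    return shares_spy, shares_dia


def realized_wealth(total_wealth, shares_spy, shares_dia,
                    price_spy, price_dia, cost_spy, cost_dia):
    """Wealth after closing both legs at the current prices."""
    pnl = shares_spy * (price_spy - cost_spy) + shares_dia * (price_dia - cost_dia)
    return total_wealth + pnl


# Made-up prices for illustration only.
wealth = 1000.0
cost_spy, cost_dia = 100.0, 90.0
shares_spy, shares_dia = open_spread(wealth, cost_spy, cost_dia)
wealth_after = realized_wealth(wealth, shares_spy, shares_dia,
                               price_spy=102.0, price_dia=90.5,
                               cost_spy=cost_spy, cost_dia=cost_dia)
print(round(wealth_after, 2))
```

The SPY leg gains 10 × 2 = 20 and the short DIA leg loses about 5.56, so wealth ends near 1014.44. Note that each leg is sized at the full wealth, so the gross exposure is twice the capital.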

Performance is improved.

### What’s Next?

We have shown how the Kalman filter can be used for pairs trading between the S&P 500 ETF and the Dow Jones ETF. We double the Sharpe ratio by implementing a second-order time-adaptive linear regression based on the Kalman filter and Taylor expansion. In our next post on the Kalman filter, we will examine $N$-asset pairs trading and possibly the non-linear Kalman filter.