Half-Lives for Trading Correlated Assets? Unpacking Correlation Mean Reversion and Predictive Power

Today, we're diving into a fascinating corner of quantitative finance: the interplay between mean-reverting correlations and how we can best predict asset movements, particularly in the context of strategies like pairs trading. If you've ever thought about how the relationship between two assets evolves over time, and how to capture that evolution for better predictions, this post is for you.

Our central question for today's exploration is: What is the relationship between the mean reversion time of a correlation (how quickly it tends to return to its average) and the "best" look-back period (or half-life) to use when trying to predict one asset's returns based on another?

To answer this, I set up a simulation. Here's a quick rundown of my experiment:

  1. Mean-Reverting Correlation: I simulated 50 different "mean-reverting random walks" for the correlation between two hypothetical assets. This means the correlation itself isn't static; it jitters around but constantly tries to pull back towards a long-term target (in my case, 0.6). This mimics how real-world relationships between assets often behave – they don't stay perfectly correlated or uncorrelated forever.


  2. Asset Value Simulation: I then simulated the daily values of two assets. Asset 1 followed a simple random walk, and Asset 2's returns were generated such that they were correlated with Asset 1's returns, with the correlation changing over time as per my mean-reverting correlation walks.

  3. Predicting Asset 2 Returns: This is where the core of our investigation lies. For each simulated pair of assets, I tried to predict the next day's returns of Asset 2 using a linear regression model. My predictors were the previous day's returns for both Asset 1 and Asset 2.

  4. The "Half-Life" Twist: Crucially, I didn't just use a simple linear regression. I employed an exponentially weighted linear regression. This means that more recent data points (returns from yesterday, the day before, etc.) were given more weight than older data points. The rate at which old data "fades" is controlled by a "half-life" parameter – the time it takes for the weight of a data point to halve.

  5. Cross-Validation and RMSE: To evaluate the predictive power for different half-lives, I used a time-series form of leave-one-out cross-validation (LOOCV), which in practice amounts to walk-forward, one-step-ahead validation. For each point in time, I used only the historical data up to that point (weighted by the chosen half-life) to predict the very next day's Asset 2 return. I then measured the Root Mean Squared Error (RMSE) of these predictions. A lower RMSE indicates better predictive accuracy.
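To make the five steps above concrete, here is a minimal sketch of how such an experiment could be wired up. It is not my exact code. The mean-reversion rate `kappa`, the correlation volatility `vol`, the daily return volatility, the 252-day year, the clipping bounds, and the `min_obs` warm-up are all illustrative assumptions, and the function names are made up for this example. Only the 0.6 target and the overall procedure come from the description above.

```python
import numpy as np

DAYS_PER_YEAR = 252  # assumed trading-day convention


def simulate_correlation(n_days, kappa=0.5, target=0.6, vol=0.3, rng=None):
    """Discretised mean-reverting walk for the correlation (steps 1)."""
    rng = np.random.default_rng() if rng is None else rng
    dt = 1.0 / DAYS_PER_YEAR
    rho = np.empty(n_days)
    rho[0] = target
    for t in range(1, n_days):
        pull = kappa * (target - rho[t - 1]) * dt          # drift towards target
        shock = vol * np.sqrt(dt) * rng.standard_normal()  # random jitter
        # Clip so the value stays a valid correlation (assumed bounds).
        rho[t] = np.clip(rho[t - 1] + pull + shock, -0.99, 0.99)
    return rho


def simulate_returns(rho, daily_vol=0.01, rng=None):
    """Daily returns of two assets with time-varying correlation rho (step 2)."""
    rng = np.random.default_rng() if rng is None else rng
    z1 = rng.standard_normal(len(rho))
    z2 = rng.standard_normal(len(rho))
    r1 = daily_vol * z1
    r2 = daily_vol * (rho * z1 + np.sqrt(1.0 - rho**2) * z2)
    return r1, r2


def ew_walk_forward_rmse(r1, r2, half_life_years, min_obs=60):
    """One-step-ahead RMSE of an exponentially weighted regression (steps 3-5)."""
    # Row i holds day i's returns; the target is Asset 2's return on day i + 1.
    X = np.column_stack([np.ones(len(r1) - 1), r1[:-1], r2[:-1]])
    y = r2[1:]
    half_life_days = half_life_years * DAYS_PER_YEAR
    errors = []
    for t in range(min_obs, len(y)):
        age = np.arange(t, 0, -1)            # newest training row has age 1
        w = 0.5 ** (age / half_life_days)    # weight halves every half-life
        sw = np.sqrt(w)                      # weighted LS via sqrt-weight scaling
        beta, *_ = np.linalg.lstsq(X[:t] * sw[:, None], y[:t] * sw, rcond=None)
        errors.append(y[t] - X[t] @ beta)    # out-of-sample error for the next day
    return float(np.sqrt(np.mean(np.square(errors))))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    rho = simulate_correlation(2 * DAYS_PER_YEAR, rng=rng)
    r1, r2 = simulate_returns(rho, rng=rng)
    for hl in (0.35, 0.73, 1.55, 3.28, 6.93):
        print(f"half-life {hl:5.2f} y -> RMSE {ew_walk_forward_rmse(r1, r2, hl):.6f}")
```

The weighted fit is done by scaling each row by the square root of its weight before an ordinary least-squares solve, which is a standard way to get weighted least squares out of `np.linalg.lstsq`. A single short run like the demo is far noisier than an average over 50 long simulations, so do not expect it to reproduce the numbers in this post.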

The Results: Unveiling the "Best" Half-Life

Before running the cross-validation, I analytically calculated the expected mean reversion half-life of my simulated correlation process. With my chosen parameters, the expected correlation mean-reversion half-life is approximately 1.39 years.
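For readers who want the formula behind that number, here is a sketch under one assumption: that the mean-reverting walk behaves like an Ornstein-Uhlenbeck-type process with reversion rate \kappa (per year), long-term target \bar{\rho} = 0.6, and volatility \sigma. The symbols \kappa and \sigma are introduced here for illustration only.

```latex
% Assumed dynamics of the correlation
d\rho_t = \kappa\,(\bar{\rho} - \rho_t)\,dt + \sigma\,dW_t

% Expected gap to the target decays exponentially
\mathbb{E}\left[\rho_t - \bar{\rho} \mid \rho_0\right] = (\rho_0 - \bar{\rho})\,e^{-\kappa t}

% Half-life: time for the expected gap to halve
e^{-\kappa t_{1/2}} = \tfrac{1}{2}
\quad\Longrightarrow\quad
t_{1/2} = \frac{\ln 2}{\kappa}
```

As a purely illustrative check, a hypothetical \kappa = 0.5 per year would give t_{1/2} = 2 \ln 2 \approx 1.39 years, which is consistent with the figure quoted above.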

Then, I ran my cross-validation for a range of half-lives used to weight the data in the regression, from 0.25x to 5x this expected mean-reversion half-life (specifically: 0.35, 0.73, 1.55, 3.28, and 6.93 years). For each of these half-lives, I averaged the RMSE across all 50 simulations.
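The five values listed above look geometrically spaced between 0.25x and 5x the base half-life. The short check below reproduces them under two assumptions that are inferred from the numbers rather than stated anywhere: that the grid is geometric, and that the unrounded base half-life is 2 ln 2 (about 1.386 years, which rounds to the quoted 1.39).

```python
import numpy as np

# Assumed unrounded base half-life: 2 * ln(2) ~ 1.386 years (rounds to 1.39).
base_half_life = 2.0 * np.log(2.0)

# Assumed grid: five multipliers spaced geometrically from 0.25x to 5x.
multipliers = np.geomspace(0.25, 5.0, num=5)
half_lives = base_half_life * multipliers

print(np.round(half_lives, 2))  # [0.35 0.73 1.55 3.28 6.93]
```

On a grid like this, each half-life is roughly 2.1 times the previous one, which is worth keeping in mind when reading off the "best" value from the results.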

Here's what my results summary table showed:


More clearly, observe the plot of "Average Cross-Validation RMSE vs. Regression Half-Life":

The vertical red dashed line marks my analytically calculated expected correlation mean-reversion half-life (1.39 years). Notice how the lowest point on the RMSE curve (the regression half-life with the best predictive performance) sits right next to this red line! Of the five half-lives tested, 1.55 years yielded the lowest average RMSE, 0.023831, and it is also the tested value closest to my expected 1.39 years. Since neighbouring half-lives on this grid differ by roughly a factor of two, the match is only as precise as that spacing.

The Conclusion: A Powerful Link

My simulation suggests that the optimal half-life to use in an exponentially weighted linear regression for predicting correlated asset returns is roughly equal to the mean reversion half-life of the underlying correlation process itself.

This makes intuitive sense. If the correlation between two assets tends to revert to a mean with a certain half-life, then weighting my historical data so that the influence of past observations decays at a similar rate allows my model to "learn" the current, most relevant relationship between the assets. Data that is too old (beyond the correlation's mean reversion time) is less representative of the current regime, while a half-life that is too short throws away most of the history, so the model overreacts to noise and misses the true underlying trend of the correlation.

This finding has significant implications for your quantitative trading strategies, particularly those that rely on dynamic relationships between assets, like pairs trading. The very process of finding that 'best' half-life through empirical testing also reveals important insights into the nature and dynamics of the asset correlations themselves, which you can then leverage for further model improvements.

