A Deeper Dive: How Model Complexity and Prediction Horizon Shape Optimal Half-Life

In quantitative analysis, selecting the appropriate model "memory" is a fundamental challenge when working with time-series data. This memory, often controlled by a half-life parameter in exponentially weighted methods, determines how much influence historical observations have on a model's predictions. The following analysis explores how the optimal half-life depends on a model's structural complexity and on the prediction horizon, using a simulation that incorporates fast-reverting "trade impact" noise.

Method: Simulation and Analysis

To investigate this relationship, we structured the simulation as follows:

Synthetic Data Generation: Two assets were simulated with a mean-reverting correlation. The analytical half-life of this correlation was approximately 6.93 days. To create a realistic, noisy environment, a separate, faster mean-reverting process was added to the price series of both assets. This "trade impact" noise had an analytical half-life of approximately 1.39 days, representing high-frequency, idiosyncratic price movements.
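The two half-lives quoted above are consistent with the standard relation for exponential mean reversion, half_life = ln(2) / theta, where theta is the reversion rate per day. The minimal sketch below assumes that relation applies to both simulated processes. The post does not state the rates themselves, so the function names and the implied rates (roughly 0.1 and 0.5 per day) are inferences, not the simulation's documented parameters.

```python
import math


def half_life_from_rate(theta: float) -> float:
    """Half-life in days of an exponential decay with rate theta per day."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    return math.log(2) / theta


def rate_from_half_life(half_life: float) -> float:
    """Reversion rate per day implied by a half-life given in days."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return math.log(2) / half_life


# Assumption: rates inferred from the half-lives quoted in the text.
correlation_rate = rate_from_half_life(6.93)  # close to 0.1 per day
noise_rate = rate_from_half_life(1.39)        # close to 0.5 per day
```

Under this reading, the trade impact noise reverts about five times faster than the correlation.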
Model Architectures:
- Simple Model: An XGBoost regression model using the prior day's returns of both assets (1-day lag) as its predictive features.
- Complex Model: An XGBoost model whose features included the past ten days of returns for both assets (10 lags) and the top three Principal Components (PCs) derived from these lagged returns.
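As a rough illustration of the two feature sets, the sketch below builds lagged-return rows in plain Python. The helper name `lagged_features` and the toy return series are hypothetical. The principal components used by the Complex Model would be derived from the 10-lag rows in a separate step, which is omitted here.

```python
from typing import List, Sequence


def lagged_features(
    returns_a: Sequence[float],
    returns_b: Sequence[float],
    n_lags: int,
) -> List[List[float]]:
    """Build one feature row per day t, holding both assets' returns
    at t-1, t-2, ..., t-n_lags (asset A first, then asset B, per lag)."""
    if len(returns_a) != len(returns_b):
        raise ValueError("both return series must have the same length")
    rows = []
    for t in range(n_lags, len(returns_a)):
        row = []
        for lag in range(1, n_lags + 1):
            row.append(returns_a[t - lag])
            row.append(returns_b[t - lag])
        rows.append(row)
    return rows


# Toy data, purely illustrative (not the simulated series from the post).
a = [0.01 * i for i in range(12)]
b = [-0.01 * i for i in range(12)]

simple_X = lagged_features(a, b, n_lags=1)    # 2 features per row
complex_X = lagged_features(a, b, n_lags=10)  # 20 features per row, before PCs
```

Note that the longer lag window leaves fewer usable rows from the same history, which is one reason a feature-rich model may benefit from a longer training memory.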
Analysis and Evaluation:
- An exponentially weighted approach was used for training both models, with a range of half-lives being tested.
- Performance was evaluated across five prediction horizons (1, 3, 14, 53, and 200 days) using the weighted average Root Mean Squared Error (RMSE).
- The optimal half-life for each scenario was defined as the one that resulted in the lowest average RMSE over 1000 simulation runs.
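The weighting and selection steps above can be sketched as follows. This assumes the common convention that an observation `age` days old receives weight 0.5 ** (age / half_life); the post does not spell out its exact weighting scheme. All function names and the toy RMSE values are hypothetical. Weights of this kind can typically be supplied to a model's fitting routine as per-sample weights.

```python
import math
from typing import Dict, List, Sequence


def exp_weights(n_obs: int, half_life: float) -> List[float]:
    """Weights for observations ordered oldest to newest.
    The newest observation gets weight 1; weight halves every half_life days."""
    return [0.5 ** ((n_obs - 1 - i) / half_life) for i in range(n_obs)]


def weighted_rmse(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Root of the weighted mean of squared errors."""
    total = sum(weights)
    squared = sum(w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, weights))
    return math.sqrt(squared / total)


def best_half_life(rmse_by_half_life: Dict[float, Sequence[float]]) -> float:
    """Half-life whose RMSE, averaged across simulation runs, is lowest."""
    means = {h: sum(v) / len(v) for h, v in rmse_by_half_life.items()}
    return min(means, key=means.get)


w = exp_weights(n_obs=27, half_life=13.0)

# Toy per-run RMSE values, invented for illustration only.
toy_runs = {2.0: [0.30, 0.32], 8.0: [0.25, 0.27], 32.0: [0.29, 0.31]}
chosen = best_half_life(toy_runs)
```

With a 13-day half-life, an observation 13 days old carries half the weight of the most recent one, and an observation 26 days old carries a quarter.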

Simulation Results

The following table presents the optimal half-life and the corresponding lowest average RMSE for the Simple and Complex models at each prediction horizon.

Predictive Performance Analysis

The graph below visually compares the best average RMSE achieved by both the Simple and Complex models across the different prediction horizons.

The visualization indicates that for prediction horizons of 1 and 3 days, both models exhibit similar predictive performance as measured by RMSE. As the prediction horizon increases beyond 3 days, the Complex Model consistently achieves a lower RMSE than the Simple Model. This suggests that the additional features and complexity of the Complex Model contribute to improved predictive accuracy for longer-term forecasts.

Conclusion

The simulation results demonstrate that a model's optimal half-life is influenced by both its structural complexity and the target prediction horizon. The Complex Model, with its broader feature set, tends to perform best with a longer half-life, particularly for long-term forecasts. This allows it to better leverage historical information to identify patterns. Conversely, the Simple Model, constrained by a single lag, tends to perform optimally with a shorter half-life.

An interesting finding is that performance peaks at the 53-day horizon, where the optimal half-lives of both models converge to 13 days. This can be interpreted in light of the mean-reversion times of the simulation's components. The trade impact noise, with a half-life of ~1.4 days, is highly transient and is filtered out by even a short memory. The underlying correlation, however, has a longer half-life of ~6.9 days. A 13-day half-life is long enough to filter out the high-frequency trade noise while still capturing the dynamics of the underlying correlation, which has several half-lives in which to revert over a 53-day horizon. It represents a balanced memory: long enough to see through the noise, but not so long that it incorporates information from an outdated correlation regime. Previously, we found that the best half-life was roughly equal to the mean-reversion time of the correlation time series; here the optimum may have been pushed out to a longer value by the difficulty of separating the fast noise from the slower correlation. This suggests that in environments with fast-reverting noise, there may be a specific memory length that filters out the noise effectively regardless of model complexity.
