Causal Discovery in Stock Return


Ensemble Deep Learning for Stock Return Prediction in Volatile Markets

Evelyn Huang, Jason Dai, Vivian Zhao, Yishan Cai
xih037@ucsd.edu, rdai@ucsd.edu, vxzhao@ucsd.edu, yic075@ucsd.edu

Mentors: Biwei Huang (bih007@ucsd.edu), Jelena Bradic (jbradic@ucsd.edu)


What is the Problem?

Financial market volatility, driven by economic conditions, corporate shocks, and investor behavior, makes stock return forecasting highly challenging.

Traditional methods, such as macroeconomic analysis and sentiment analysis, have limitations in capturing the complex relationships between stock returns and external factors.

Our goal is to develop an advanced stock return prediction framework that leverages causal discovery algorithms to enhance interpretability and accuracy.



Current Solutions and Their Limitations

Macroeconomic Analysis: Prior work has used macro-level economic indicators, such as interest rates, inflation, or unemployment rates, to predict stock returns, but these methods typically do not establish causal links that explain their relationship with stock returns.

Sentiment Analysis: Prior work has used the tone of news coverage to predict stock returns. However, this approach could not capture long-term trends.

DeepAR Algorithm: Deep learning algorithms such as DeepAR have been used to predict stock returns before, but this approach lacked explainability and transparency.



Proposed Solutions

We propose an advanced stock return prediction framework that integrates PCMCI+ (Peter and Clark Momentary Conditional Independence plus), a causal discovery algorithm, with DeepAR to enhance both interpretability and accuracy. Our approach uses PCMCI+ to discover the causal structure in our multivariate time-series data, identifying causal relationships rather than relying on correlations alone. The model takes two kinds of input: past stock data and tweet sentiment data.

The approach involves two steps:

  1. Feature Selection based on causal relevance, which filters out spurious correlations and retains the most influential factors, keeping the model simple for interpretability.
  2. Lag Optimization to determine the optimal time dependency, providing a clear rationale for the model's decisions and improving transparency.
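The two steps above can be illustrated with a deliberately simplified sketch. It does not run PCMCI+; it substitutes plain lagged-correlation screening, which, unlike PCMCI+, does not condition on other variables and so cannot rule out confounded links. The function names, the threshold of 0.3, the maximum lag of 5, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def best_lag(driver, target, max_lag):
    """Lag optimization (simplified): find the lag in 1..max_lag at which
    driver[t - lag] is most strongly correlated with target[t]."""
    best = (0, 0.0)
    for lag in range(1, max_lag + 1):
        r = pearson(driver[:-lag], target[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


def select_features(candidates, target, max_lag=5, threshold=0.3):
    """Feature selection (simplified): keep only candidates whose strongest
    lagged correlation with the target reaches the threshold."""
    selected = {}
    for name, series in candidates.items():
        lag, r = best_lag(series, target, max_lag)
        if abs(r) >= threshold:
            selected[name] = lag
    return selected


# Synthetic data (not real market data): returns depend on sentiment two steps earlier.
rng = random.Random(0)
n = 300
sentiment = [rng.gauss(0, 1) for _ in range(n)]
noise_feature = [rng.gauss(0, 1) for _ in range(n)]
returns = [0.0, 0.0] + [0.8 * sentiment[t - 2] + 0.3 * rng.gauss(0, 1) for t in range(2, n)]

selected = select_features({"sentiment": sentiment, "noise_feature": noise_feature}, returns)
print(selected)
```

In this toy setting the screening keeps the sentiment series with its generating lag and drops the unrelated series. A causal discovery method would additionally test each link conditional on the other variables' pasts before retaining it.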

Additionally, we incorporate external economic factors at varying granularities using an economic impact model. By integrating causal inference techniques with deep learning and forecasting models, our framework balances predictive power and interpretability while accounting for data availability lag, making advanced stock forecasting more reliable and transparent for investors.
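One way to handle mixed granularities and data availability lag is sketched below. It is an assumption about how such alignment could be done, not a description of the economic impact model itself. Each trading day receives the most recent indicator value that had already been released by that day, so no future information leaks into the features. The function name `align_to_daily`, the release dates, and the indicator values are hypothetical.

```python
from bisect import bisect_right
from datetime import date


def align_to_daily(releases, trading_days):
    """Map a low-frequency indicator onto daily dates.

    releases: list of (release_date, value) tuples, sorted by release_date.
    trading_days: list of dates to align to.
    Returns one value per trading day: the latest value released on or
    before that day, or None if nothing had been released yet.
    """
    release_dates = [d for d, _ in releases]
    aligned = []
    for day in trading_days:
        i = bisect_right(release_dates, day)  # number of releases up to and including `day`
        aligned.append(releases[i - 1][1] if i > 0 else None)
    return aligned


# Hypothetical monthly indicator releases (illustrative values, not real data).
indicator_releases = [(date(2024, 1, 11), 1.0), (date(2024, 2, 13), 2.0)]
days = [date(2024, 1, 10), date(2024, 1, 11), date(2024, 2, 12), date(2024, 2, 13)]

daily_indicator = align_to_daily(indicator_releases, days)
print(daily_indicator)
```

Keying on the release date rather than the reference period is the design choice that matters here: a monthly figure only becomes usable on the day it is published.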