Financial market predictability with artificial intelligence and machine learning techniques
Abstract
This thesis explores the intersection of financial markets and machine learning, focusing on financial return predictability and the imputation of missing values in financial datasets. The research aims to enhance our understanding of financial market dynamics through the lens of interpretable machine learning models. Specifically, the thesis employs advanced machine learning techniques to predict financial returns and address missing data issues, which are common but often overlooked in financial literature.
The first chapter uses an interpretable machine learning model, LassoNet, to forecast U.S. industry portfolio returns. LassoNet combines a regularization mechanism with a neural network architecture to enforce covariate sparsity. The findings show that LassoNet outperforms linear and non-linear models in forecasting accuracy, with valuation ratios and individual and cross-industry lagged returns being the most critical covariates. The model's forecasts enable the construction of profitable industry ETF portfolios that outperform benchmarks in annualized returns, Sharpe ratios, and alpha values.
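As an illustration of one of the performance measures named above, the following is a minimal sketch of an annualized Sharpe ratio. It assumes monthly excess returns, a sample standard deviation, and scaling by the square root of 12. The thesis's exact convention is not stated in the abstract, and the function name and sample figures are hypothetical.

```python
import math
import statistics

def annualized_sharpe(excess_returns, periods_per_year=12):
    """Mean excess return per unit of volatility, scaled to a yearly horizon.

    Assumes returns are already in excess of the risk-free rate and are
    sampled `periods_per_year` times per year (12 = monthly).
    """
    mean = statistics.mean(excess_returns)
    std = statistics.stdev(excess_returns)  # sample standard deviation (n - 1)
    return mean / std * math.sqrt(periods_per_year)

# Hypothetical monthly excess returns, for illustration only.
monthly_excess = [0.01, 0.03, -0.01, 0.02]
sharpe = annualized_sharpe(monthly_excess)
```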
The second chapter focuses on imputing missing hedge fund return data using a deep learning model, the bidirectional recurrent imputation network for time series (BRITS). BRITS is compared with other imputation methods, such as the cross-sectional mean and matrix completion. The results indicate that imputing the missing values with BRITS significantly enhances the forecasting accuracy and economic performance of predictive models. The imputed data leads to lower out-of-sample errors and higher investment returns, demonstrating BRITS' superiority in handling missing values.
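For context, the simplest baseline mentioned above, the cross-sectional mean, can be sketched as follows. This is a minimal illustration, not the thesis's implementation. It assumes a panel stored as one row per period with one return per fund, and the function name and sample data are hypothetical.

```python
import math

def cross_sectional_mean_impute(panel):
    """Fill each missing return (NaN) with the mean of the funds observed in the same period.

    `panel` is a list of rows, one per period; each row holds one return per fund.
    """
    imputed = []
    for row in panel:
        observed = [r for r in row if not math.isnan(r)]
        # If no fund reports in this period there is nothing to average,
        # so the gaps in that row stay NaN.
        fill = sum(observed) / len(observed) if observed else math.nan
        imputed.append([fill if math.isnan(r) else r for r in row])
    return imputed

# Hypothetical monthly returns for three funds; NaN marks an unreported month.
returns = [
    [0.01, math.nan, 0.03],
    [math.nan, 0.02, 0.04],
]
filled = cross_sectional_mean_impute(returns)
```

A baseline of this kind ignores each fund's own return history. A recurrent model such as BRITS, by contrast, is designed to use the time-series dimension.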
In the third chapter, the state-of-the-art neural network architecture TabNet is utilized to forecast the directional movements of excess returns in industry portfolios. TabNet surpasses other models in classification accuracy and highlights the importance of valuation ratios and lagged returns in its predictions. The model effectively captures seasonal effects and cross-industry economic links and attains the highest annualized returns and positive Sharpe ratios in trading applications.
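The classification accuracy referred to above can be illustrated with a minimal sketch of directional accuracy. It assumes a binary up/down target defined by the sign of the realized excess return, with a zero return counted as "down". The abstract does not specify the thesis's exact target definition, and the names and sample data below are hypothetical.

```python
def directional_accuracy(predicted_up, realized_excess_returns):
    """Share of periods in which the predicted direction matches the realized sign."""
    if len(predicted_up) != len(realized_excess_returns):
        raise ValueError("inputs must have the same length")
    # A prediction is a hit when "up" is predicted and the excess return is positive,
    # or "down" is predicted and the excess return is zero or negative.
    hits = sum(pred == (r > 0) for pred, r in zip(predicted_up, realized_excess_returns))
    return hits / len(predicted_up)

# Hypothetical forecasts (True = up) and realized excess returns, for illustration only.
predictions = [True, False, True, True]
realized = [0.02, -0.01, -0.03, 0.01]
accuracy = directional_accuracy(predictions, realized)
```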
Type
Thesis, PhD Doctor of Philosophy
Description of related resources
Zografopoulos, L., Iannino, M. C., Psaradellis, I., & Sermpinis, G. (2025). Industry return prediction via interpretable deep learning. European Journal of Operational Research, 321(1), 257-268. Advance online publication. https://doi.org/10.1016/j.ejor.2024.08.032
Related resources
https://doi.org/10.1016/j.ejor.2024.08.032
Items in the St Andrews Research Repository are protected by copyright, with all rights reserved, unless otherwise indicated.