Machine Learning for Earnings Prediction: A Nonlinear Tensor Approach for Data Integration and Completion

Abstract

Successful predictive models for financial applications often require harnessing complementary information from multiple datasets. Incorporating data from different sources into a single model can be challenging because the sources vary in structure, dimensions, quality, and completeness. Simply merging those datasets can cause redundancy, discrepancy, and information loss. This paper proposes a convolutional neural network-based nonlinear tensor coupling and completion framework (NLTCC) to combine heterogeneous datasets without compromising data quality. We demonstrate the effectiveness of NLTCC in solving a specific business problem: predicting firms’ earnings from financial analysts’ earnings forecasts. First, we apply NLTCC to fuse firm characteristics and stock market information with the financial analysts’ earnings forecast data to impute missing values and improve data quality. Subsequently, we predict the next quarter’s earnings based on the imputed data. The experiments reveal that the prediction error decreases by 65% compared with the benchmark analysts’ consensus forecast. The long-short portfolio returns based on NLTCC outperform both the analysts’ consensus forecast and the S&P 500 index for holding periods ranging from three days to two months. The improvement in prediction accuracy is robust across different performance metrics and industry sectors. Notably, the improvement is more pronounced in sectors with higher heterogeneity.
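To make the coupling-and-completion idea concrete, the sketch below imputes missing analyst forecasts by sharing one set of latent firm factors between a forecast matrix and a fully observed firm-characteristics matrix. It is not NLTCC: it swaps the paper's CNN-based nonlinear tensor model for a plain linear coupled matrix factorization fitted by alternating least squares, and it runs on synthetic data. The function names (`coupled_als`, `objective`), the rank, the regularization weight, and the matrix shapes are all assumptions chosen for illustration.

```python
import numpy as np

def objective(X0, mask, Y, U, V, W, lam):
    """Regularized squared error on observed forecasts plus side information."""
    rx = (X0 - U @ V.T) * mask          # residual on observed forecasts only
    ry = Y - U @ W.T                    # residual on firm characteristics
    reg = (U ** 2).sum() + (V ** 2).sum() + (W ** 2).sum()
    return (rx ** 2).sum() + (ry ** 2).sum() + lam * reg

def coupled_als(X, Y, rank=2, lam=0.1, n_iter=20, seed=0):
    """Fill NaNs in X (firms x analysts) using Y (firms x features).

    Linear stand-in model: X ~ U V^T and Y ~ U W^T with shared firm factors U.
    """
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(X)                 # True where a forecast is observed
    X0 = np.where(mask, X, 0.0)
    n, m = X.shape
    U = rng.normal(size=(n, rank))
    V = rng.normal(size=(m, rank))
    W = rng.normal(size=(Y.shape[1], rank))
    ridge = lam * np.eye(rank)
    history = []
    for _ in range(n_iter):
        # Firm factors: each row sees its observed forecasts and its features.
        for i in range(n):
            Vo = V[mask[i]]
            A = Vo.T @ Vo + W.T @ W + ridge
            b = Vo.T @ X0[i, mask[i]] + W.T @ Y[i]
            U[i] = np.linalg.solve(A, b)
        # Analyst factors: each column sees only the firms it covers.
        for j in range(m):
            Uo = U[mask[:, j]]
            A = Uo.T @ Uo + ridge
            b = Uo.T @ X0[mask[:, j], j]
            V[j] = np.linalg.solve(A, b)
        # Feature loadings: ordinary ridge regression of Y on U.
        W = np.linalg.solve(U.T @ U + ridge, U.T @ Y).T
        history.append(objective(X0, mask, Y, U, V, W, lam))
    # Keep observed forecasts as-is; fill only the missing entries.
    X_filled = np.where(mask, X, U @ V.T)
    return X_filled, history

# Synthetic example (hypothetical data): 30 firms, 8 analysts, 5 features.
rng = np.random.default_rng(42)
U_true = rng.normal(size=(30, 2))
X_full = U_true @ rng.normal(size=(2, 8))
Y = U_true @ rng.normal(size=(2, 5))
X = X_full.copy()
X[rng.random(X.shape) < 0.3] = np.nan   # hide roughly 30% of forecasts

X_filled, history = coupled_als(X, Y)
```

Because each alternating step solves its least-squares subproblem exactly, the recorded objective in `history` should not increase from one sweep to the next. The imputed matrix `X_filled` could then feed a downstream earnings predictor, which is the second stage the abstract describes.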

Publication
2022 International Conference on AI in Finance