Asset pricing model using a multilayer perceptron pretrained on chaotic time-series data and retrained on financial data
Contributors: Danny Liu, Aeres Zhou, Baddis Labbedi
Idea: We will train a multilayer perceptron (MLP) to predict chaotic systems and then retrain it on market data for a liquid asset. By first training on general chaotic systems, we hope the model will generalize better than one trained purely on the historical data of a single security. The chaotic training data can be either self-generated (Lorenz systems, Hénon maps, Mackey-Glass equations, etc.) or natural (weather patterns, animal populations, biological systems, etc.). Our choice of MLPs comes from chaotic systems research by our group member Badis Labbedi. When the predicted future price of the instrument differs from the actual current price by more than a set threshold, we will trade against the direction of the deviation and exit the position when the price falls back within our bounds. We will track the delta between the initial price and the final price over fixed time intervals, whose length will stay constant between training and testing.
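The entry and exit rule described above can be sketched in a few lines of Python. This is only one possible reading of the rule: it assumes the deviation is measured as (actual - predicted) / predicted, that "trading against the deviation" means going short when the price sits above the model's value and long when it sits below, and that we are flat whenever the deviation is inside the band. The function name, the threshold value, and the sample numbers are illustrative, not part of the proposal.

```python
def positions(actual, predicted, threshold):
    """Return one position per step: +1 long, -1 short, 0 flat.

    Assumption: deviation = (actual - predicted) / predicted,
    and we always trade against its sign.
    """
    out = []
    for a, p in zip(actual, predicted):
        deviation = (a - p) / p
        if deviation > threshold:
            out.append(-1)  # price above the modeled value: short
        elif deviation < -threshold:
            out.append(1)   # price below the modeled value: long
        else:
            out.append(0)   # back inside the band: exit or stay flat
    return out


# Made-up prices for illustration only.
actual = [100.0, 103.0, 101.0, 97.0, 100.0]
predicted = [100.0, 100.0, 100.0, 100.0, 100.0]
print(positions(actual, predicted, threshold=0.02))
```

With a 2% band, the 3% moves away from the modeled price open a position against the move, and the position is closed once the price is back within 2%.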
Synthetic chaos: We can generate our own datasets from established equations that describe chaotic systems and feed them into the model. The wide range of available systems gives the model more varied data to train on, which should improve its accuracy while avoiding overfitting to any particular dataset. The supply of this data is also theoretically unlimited, since we can adjust the starting conditions and other parameters of the equations to generate an entirely new chaotic trajectory.
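As a rough illustration of how such data could be generated, the sketch below produces a Hénon map series and a Lorenz series in plain Python, then slices a series into lagged windows for one-step-ahead prediction. The parameter values are the commonly used textbook ones (a = 1.4, b = 0.3 for Hénon; sigma = 10, rho = 28, beta = 8/3 for Lorenz). The step size, the fourth-order Runge-Kutta integrator, the window length, and the function names are our assumptions for the example.

```python
def henon(n, a=1.4, b=0.3, x0=0.0, y0=0.0):
    """Iterate the Henon map x' = 1 - a*x^2 + y, y' = b*x; return the x values."""
    xs, x, y = [], x0, y0
    for _ in range(n):
        x, y = 1.0 - a * x * x + y, b * x
        xs.append(x)
    return xs


def lorenz(n, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0, start=(1.0, 1.0, 1.0)):
    """Integrate the Lorenz system with RK4; return the x coordinate per step."""
    def deriv(s):
        x, y, z = s
        return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

    def shift(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    xs, s = [], start
    for _ in range(n):
        k1 = deriv(s)
        k2 = deriv(shift(s, k1, dt / 2))
        k3 = deriv(shift(s, k2, dt / 2))
        k4 = deriv(shift(s, k3, dt))
        s = tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                  for si, a, b, c, d in zip(s, k1, k2, k3, k4))
        xs.append(s[0])
    return xs


def windows(series, lag):
    """Pair each run of `lag` values with the value that follows it."""
    return [(series[i:i + lag], series[i + lag]) for i in range(len(series) - lag)]


series = lorenz(2000)
pairs = windows(series, lag=5)  # (inputs, target) training examples
print(len(pairs))
```

Changing `start` or the equation parameters yields a different trajectory, which is how new training sets would be produced.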
Natural chaos: We can also find natural instances of chaotic systems and train our model on those. These have the added benefit of being more “truly” chaotic than data generated by equations, so the results of training on them may be more realistic than with synthetic data. The caveats are that the data may be messy or difficult to use in its raw form, so processing it will take considerable effort. Additionally, there is little publicly available data fitting our criteria, so the benefits of this data source are rather limited.
Adaptation to markets: After training on the natural and synthetic chaos datasets, we will remove the output layer and create a new output layer, which will be trained on the financial data of some index or other security. We have yet to determine specifically what we want our model to trade, but to fit our criteria it should probably be a very liquid asset such as an S&P 500 index tracker or a large-cap stock.
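The output-layer swap can be illustrated with a small, dependency-free sketch. It assumes a single tanh hidden layer, assumes the hidden weights are kept frozen while the new output layer is fitted (the proposal does not say whether they would be fine-tuned), and uses a scaled sine wave as a placeholder for financial data. The random hidden weights merely stand in for a network pretrained on chaotic data. All names and hyperparameters here are hypothetical.

```python
import math
import random


class TinyMLP:
    """One tanh hidden layer followed by a linear output layer."""

    def __init__(self, n_in, n_hidden, rng):
        self.w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)]
                         for _ in range(n_hidden)]
        self.b_hidden = [0.0] * n_hidden
        self.reset_head(rng)

    def reset_head(self, rng):
        """Discard the output layer and start a fresh one."""
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in self.w_hidden]
        self.b_out = 0.0

    def features(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w_hidden, self.b_hidden)]

    def predict(self, x):
        return sum(w * h for w, h in zip(self.w_out, self.features(x))) + self.b_out

    def train_head(self, data, lr=0.05, epochs=200):
        """SGD on squared error; only the output layer is updated."""
        for _ in range(epochs):
            for x, y in data:
                h = self.features(x)
                err = sum(w * hi for w, hi in zip(self.w_out, h)) + self.b_out - y
                self.w_out = [w - lr * err * hi for w, hi in zip(self.w_out, h)]
                self.b_out -= lr * err


def mse(model, data):
    return sum((model.predict(x) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)
model = TinyMLP(n_in=3, n_hidden=8, rng=rng)  # stand-in for the pretrained net

# Placeholder series (NOT market data): three lagged values -> next value.
series = [math.sin(0.3 * t) for t in range(60)]
data = [(series[i:i + 3], series[i + 3]) for i in range(len(series) - 3)]

frozen = [row[:] for row in model.w_hidden]  # copy to confirm nothing moves
model.reset_head(rng)                        # remove the old output layer
before = mse(model, data)
model.train_head(data)                       # fit the new one on the new data
after = mse(model, data)
print(before, after)
```

Only `w_out` and `b_out` change during `train_head`, so whatever the hidden layer learned during pretraining is carried over unchanged.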
Backtesting/Tuning: In backtesting, we will primarily tune two parameters to maximize risk-adjusted returns: the signal-generation time period over which the model performs best, and the execution threshold, meaning the proportional difference between the modeled price and the real price at which we trade.
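A threshold sweep of this kind might look like the sketch below. It assumes the same deviation rule as in the Idea section (short above the band, long below, flat inside), scores each candidate threshold by mean return divided by standard deviation (an unannualized Sharpe-like ratio chosen here only for illustration), and ignores transaction costs and execution delay. The prices, candidate values, and function names are made up for the example. Tuning the signal time period would add an outer loop over horizons, each with its own set of predictions.

```python
from statistics import mean, pstdev


def strategy_returns(prices, predicted, threshold):
    """Per-step returns; the position chosen at t earns the t -> t+1 price change."""
    out = []
    for t in range(len(prices) - 1):
        deviation = (prices[t] - predicted[t]) / predicted[t]
        if deviation > threshold:
            position = -1
        elif deviation < -threshold:
            position = 1
        else:
            position = 0
        out.append(position * (prices[t + 1] - prices[t]) / prices[t])
    return out


def risk_adjusted(returns):
    """Mean over standard deviation; 0.0 when the returns never vary."""
    sd = pstdev(returns)
    return mean(returns) / sd if sd > 0 else 0.0


def best_threshold(prices, predicted, candidates):
    """Return (score, threshold) for the highest-scoring candidate."""
    return max((risk_adjusted(strategy_returns(prices, predicted, th)), th)
               for th in candidates)


# Made-up backtest data for illustration only.
prices = [100.0, 104.0, 100.0, 96.0, 100.0, 101.0, 100.0]
predicted = [100.0] * len(prices)
score, threshold = best_threshold(prices, predicted, [0.005, 0.02, 0.05])
print(threshold, round(score, 3))
```

Because the threshold is chosen on the same data it is scored on, a real backtest would evaluate the selected value on a held-out period.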
Deployment: We plan to feed real-time financial data through our model using an API such as Databento, or whichever we find works best. We expect real-world results to deviate from our predicted P&L because order execution will be delayed by hardware and brokerage constraints.
Resources/Proof of Concept
https://www.mdpi.com/2227-7390/12/12/1920
https://link.springer.com/article/10.1007/s43069-021-00071-2
https://www.researchgate.net/publication/380583239_MLP_and_RBF_Algorithms_in_Finance_Predicting_and_Classifying_Stock_Prices_amidst_Economic_Policy_Uncertainty