Nonlinear Factor Models in Financial Markets: A Comparative Study of Autoencoders and PCA

This project explores the use of autoencoders as a nonlinear alternative to Principal Component Analysis (PCA) for dimensionality reduction, clustering, and anomaly detection in financial return data. Using historical data from Yahoo Finance and Alpha Vantage, we evaluate how effectively these models extract latent market factors and uncover nonlinear structures.

Core Objectives

Compare PCA and autoencoders for reconstructing standardized daily returns
Evaluate performance using R² and Mean Squared Error (MSE)
Apply K-Means clustering to latent features from both models
Detect anomalies using both classical statistics (Z-score, IQR) and machine learning (Isolation Forests)
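The PCA side of the first two objectives can be sketched as follows. This is a minimal illustration on synthetic data, not the project's notebook code: the function names (`pca_reconstruct`, `mse`, `r2`), the three-factor toy data, and the pooled definition of R² are all assumptions made for the example. An autoencoder would be scored the same way by swapping in its own reconstruction for `x_hat`.

```python
import numpy as np

def pca_reconstruct(x, n_components):
    """Project x onto its top principal components and map back."""
    # x is assumed column-centered, so the right singular vectors are the PCA axes.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    components = vt[:n_components]      # (k, n_assets)
    latent = x @ components.T           # (n_days, k) factor scores
    return latent @ components, latent

def mse(x, x_hat):
    """Mean squared reconstruction error over all entries."""
    return float(np.mean((x - x_hat) ** 2))

def r2(x, x_hat):
    """Pooled R^2: 1 - SS_res / SS_tot over all entries (an assumed definition)."""
    ss_res = np.sum((x - x_hat) ** 2)
    ss_tot = np.sum((x - x.mean(axis=0)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Synthetic stand-in for a (days x assets) return matrix: 3 hidden factors plus noise.
rng = np.random.default_rng(0)
factors = rng.normal(size=(500, 3))
loadings = rng.normal(size=(3, 20))
returns = 0.01 * (factors @ loadings) + 0.002 * rng.normal(size=(500, 20))

# Standardize each asset column to zero mean and unit variance.
x = (returns - returns.mean(axis=0)) / returns.std(axis=0)

x_hat, latent = pca_reconstruct(x, n_components=3)
print(f"MSE={mse(x, x_hat):.4f}  R2={r2(x, x_hat):.4f}")
```

Because the columns are standardized to unit variance, the pooled R² here equals one minus the MSE, so the two metrics carry the same information on this scale. The `latent` matrix is the kind of low-dimensional feature set that could then be passed to K-Means.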

Datasets Used

S&P 500 (2012–2025) from Yahoo Finance
Russell 3000 (2015–2025) from Alpha Vantage API (TIME_SERIES_DAILY)
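As a rough sketch of how price data like this might be turned into the standardized daily returns the models consume: the example below assumes simple (not log) returns computed from closing prices, and the price values and helper names (`daily_returns`, `standardize`) are hypothetical. The project may use a different return definition or adjusted prices.

```python
import numpy as np

def daily_returns(prices):
    """Simple daily returns from a (days x assets) matrix of closing prices."""
    prices = np.asarray(prices, dtype=float)
    return prices[1:] / prices[:-1] - 1.0

def standardize(returns):
    """Z-score each asset column: zero mean, unit (population) variance."""
    mu = returns.mean(axis=0)
    sigma = returns.std(axis=0)
    return (returns - mu) / sigma

# Hypothetical closing prices for two tickers over five trading days.
prices = np.array([
    [100.00, 50.00],
    [101.00, 49.50],
    [102.01, 50.49],
    [100.00, 50.00],
    [103.00, 51.00],
])

rets = daily_returns(prices)   # one row fewer than prices
z = standardize(rets)          # input to PCA or an autoencoder
print(rets.shape, z.mean(axis=0).round(12), z.std(axis=0).round(12))
```

Standardizing per asset keeps high-volatility tickers from dominating the reconstruction loss, which matters when PCA and an autoencoder are compared on the same MSE scale.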

Key Findings

PCA is highly effective at capturing linear structure; autoencoders approach or exceed its reconstruction performance on nonlinear patterns as model capacity increases.
Autoencoder-based clustering reveals latent groupings (e.g., NFLX and TSLA) not found in PCA results.
Anomaly detection is more robust when incorporating autoencoder reconstruction loss and multivariate Isolation Forests.
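To illustrate the classical side of the anomaly detection finding, the sketch below applies Z-score and IQR rules to a one-dimensional series of per-day reconstruction errors. Scoring reconstruction error with these rules is an assumption for the example, as are the thresholds (3 standard deviations, 1.5 × IQR), the function names, and the synthetic error series with two injected spikes.

```python
import numpy as np

def zscore_flags(values, threshold=3.0):
    """Flag points more than `threshold` standard deviations from the mean."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()
    return np.abs(z) > threshold

def iqr_flags(values, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

# Synthetic per-day reconstruction errors with two injected spikes.
rng = np.random.default_rng(1)
errors = rng.normal(loc=1.0, scale=0.1, size=250)
errors[[40, 180]] = [3.0, 2.5]

print("z-score anomalies:", np.flatnonzero(zscore_flags(errors)))
print("IQR anomalies:    ", np.flatnonzero(iqr_flags(errors)))
```

The two rules can disagree: large outliers inflate the standard deviation and make the Z-score rule more conservative, while the IQR fences depend only on the middle half of the data and tend to flag more borderline points. A multivariate method such as an Isolation Forest would instead score whole return vectors rather than a single error series.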

Report

The full project report is written in LaTeX and available in report.pdf. It includes methodology, figures, and citations.

Citation

If you find this project useful, consider citing it or referencing the GitHub repo directly.


yi-json/auto-returns
