Build two machine learning models: a regression model to predict the Black-Scholes option price, and a classification model to predict whether the Black-Scholes model overestimates or underestimates the actual option price.

tomvdo29usc/Call_Option_Pricing_Prediction

About The Project

Build two machine learning models: a regression model to predict the Black-Scholes option price, and a classification model to predict whether the Black-Scholes model overestimates or underestimates the actual option price.

Tools: Python (NumPy, Pandas, Scikit-learn, ...), MS Excel, PowerPoint

Skills: Exploratory Data Analysis, Applied Statistics, Data Visualization, Machine Learning, Feature Engineering, Project Management, Team Collaboration, Business Communication

1. Exploratory Data Analysis

The dataset contains 1,680 records and 6 columns, as follows:

image

1.1 Data Quality Summary Table

image

Note

The % Populated, Min, Max, and Mean columns show that some fields contain missing values and outliers. We therefore considered dropping missing values and outliers before the modeling step.

1.2 Data Cleaning

We decided to remove any record with a missing value, and any record with a value more than 3 standard deviations from the mean of any field. Below is a detailed view of the records detected with missing values and/or outliers in any field:

image

Note

We dropped only 7 of the 1,680 original records (about 0.4%), which is too few to materially affect the analysis. 1,673 records remain after the data cleaning step.
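
The cleaning rule described in this section can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `clean_records`, the toy data, and the choice to drop missing values before computing the mean and standard deviation are all assumptions. Only the field names S and t come from this README.

```python
import pandas as pd


def clean_records(df: pd.DataFrame, n_std: float = 3.0) -> pd.DataFrame:
    """Drop rows with missing values, then rows where any numeric field
    lies more than n_std standard deviations from that field's mean.

    Hypothetical helper: the order of the two steps is an assumption.
    """
    complete = df.dropna()
    numeric = complete.select_dtypes(include="number")
    # z-score per field; a constant field gives 0/0 = NaN, treated as 0 here
    z = ((numeric - numeric.mean()) / numeric.std()).fillna(0.0)
    within = (z.abs() <= n_std).all(axis=1)
    return complete[within]


# Toy data (made up): 28 ordinary rows, one extreme S value, one missing S value
raw = pd.DataFrame(
    {
        "S": [100.0 + i for i in range(28)] + [10000.0, float("nan")],
        "t": [0.10 + 0.01 * i for i in range(30)],
    }
)
cleaned = clean_records(raw)
print(len(raw), "->", len(cleaned))
```

Computing the statistics after removing missing values keeps the mean and standard deviation well defined. A single extreme value still inflates the standard deviation, so this rule only flags points that are far outside the bulk of the data.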

1.3 Visualization of each Field Before and After Cleaning the Data

image image

Note

The distributions of the Stock Price (S) and Time to Maturity (t) fields become clearer after dropping outliers and missing values.

1.4 Feature Engineering

image

2. Model Development

Our objective for model exploration was to experiment with different models to select the best model for both regression and classification problems.

  • In the regression problem, we wanted to train a model that can accurately predict the option price.
  • In the classification problem, we wanted to build a model that can accurately classify whether using the Black-Scholes algorithm would underestimate or overestimate the actual option price.
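
To make the two targets concrete, here is a minimal sketch of the standard Black-Scholes formula for a European call, along with one way to build the over/under label. The parameter names `K`, `r`, and `sigma`, the label strings "Over" and "Under", and the tie-breaking rule are assumptions for illustration. This README does not confirm which inputs or labels the dataset actually uses.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes_call(S: float, K: float, t: float, r: float, sigma: float) -> float:
    """Black-Scholes price of a European call.

    S: stock price, K: strike, t: time to maturity in years,
    r: risk-free rate, sigma: annualized volatility (names are assumptions).
    """
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return S * norm_cdf(d1) - K * exp(-r * t) * norm_cdf(d2)


def bs_label(bs_price: float, actual_price: float) -> str:
    """Hypothetical label: 'Over' if Black-Scholes exceeds the actual price.

    Ties are labeled 'Under' here, which is an arbitrary choice.
    """
    return "Over" if bs_price > actual_price else "Under"


price = black_scholes_call(S=100.0, K=100.0, t=1.0, r=0.05, sigma=0.2)
print(round(price, 2))           # about 10.45
print(bs_label(price, 9.80))     # Over
```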

We tried different combinations of the following hyperparameters to find the best-performing models:

image
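
Trying combinations of hyperparameters is commonly done with scikit-learn's `GridSearchCV`. The sketch below is an illustration only: the grid values, the synthetic data, and the choice of `GradientBoostingRegressor` are assumptions, since the hyperparameters actually searched are listed in the table above.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data: two hypothetical features and a call-payoff-like target
rng = np.random.default_rng(0)
X_demo = rng.uniform(50.0, 150.0, size=(200, 2))
y_demo = np.maximum(X_demo[:, 0] - X_demo[:, 1], 0.0)

# Hypothetical grid; every combination is scored with 3-fold cross-validation
param_grid = {"n_estimators": [50, 100], "max_depth": [2, 3]}
search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid,
    scoring="r2",
    cv=3,
)
search.fit(X_demo, y_demo)
print(search.best_params_, round(search.best_score_, 3))
```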

Below is our method to evaluate and select the best model for each problem:

image

2.1 Regression Model Results

image

Note

In general, non-linear models significantly outperformed the baseline Linear Regression model. The Gradient Boosting Regression model performed best, with the highest R-squared score and the least variability between testing and cross-validation. This suggests that it is more consistent and robust than the other models.
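
One way to compare models on both average R-squared and its variability is to summarize cross-validation scores by their mean and standard deviation. The sketch below uses synthetic data, with a noisy call payoff max(S - K, 0) as the target, and default model settings. None of it reproduces the project's data or its reported scores.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic data (assumption): the target is a noisy, nonlinear function of S and K
rng = np.random.default_rng(42)
S = rng.uniform(50.0, 150.0, size=400)
K = rng.uniform(50.0, 150.0, size=400)
X = np.column_stack([S, K])
y = np.maximum(S - K, 0.0) + rng.normal(0.0, 1.0, size=400)

models = {
    "linear_regression": LinearRegression(),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

# Mean R-squared measures accuracy; the standard deviation across folds
# measures how consistent the model is from one split to another.
cv_summary = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    cv_summary[name] = (scores.mean(), scores.std())
    print(f"{name}: mean R2 = {scores.mean():.3f}, std = {scores.std():.3f}")
```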

2.2 Classification Model Results

image

Note

In general, non-linear models outperformed the baseline Logistic Regression model only slightly. Logistic Regression shows fewer signs of overfitting than the other models. The CatBoost model performed best, with the highest accuracy and the least variability between testing and cross-validation. This suggests that it is somewhat more accurate and robust than the other models.
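
A simple check for overfitting is the gap between training and testing accuracy. The sketch below illustrates this on synthetic data. It substitutes scikit-learn's `GradientBoostingClassifier` for CatBoost to avoid an extra dependency, so it shows the method only and does not reproduce the results above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary labels standing in for the over/under target (assumption)
X, y = make_classification(
    n_samples=600, n_features=6, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

classifiers = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),  # swapped in for CatBoost
}

# A large positive gap (train accuracy well above test accuracy) suggests overfitting
gaps = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    train_acc = clf.score(X_train, y_train)
    test_acc = clf.score(X_test, y_test)
    gaps[name] = (train_acc, test_acc, train_acc - test_acc)
    print(f"{name}: train = {train_acc:.3f}, test = {test_acc:.3f}")
```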

2.3 Baseline Models

2.4 Final Model Selection

image

3. Business Understandings

Several business considerations apply when predicting option values:

  1. Accurately predicting European call option values is essential for achieving the best financial outcomes, but interpretability also matters for decision-making. Understanding the relationships between predictor variables and response variables can provide valuable insights to guide investment strategies, risk management, or policy decisions.
  2. Machine learning models can outperform the Black-Scholes model in predicting option prices because they are more flexible and adaptable. The Black-Scholes model, primarily used to price European options, relies on a fixed set of assumptions and features. Machine learning models can capture complex patterns, nonlinear relationships, varying volatility, changing interest rates, and non-continuous trading scenarios. This can make them more accurate and more practical in real-world trading environments.
  3. Could these models be applied to predict Tesla’s option price? Tesla is unusual compared with other S&P 500 stocks. Because of its high volatility, the CEO’s sentiments, emerging industry dynamics, and growth expectations, predicting Tesla’s call option price from these existing patterns would be very challenging and would likely yield poor performance.
