Framework for ML in Finance

📚 Background

Machine learning in finance: machine learning algorithms are used in risk management, asset management, market analysis, and trading strategies, and have become a key tool in finance and trading.

Goals and Significance

Goals

Develop a custom Python machine learning framework
Integrate basic machine learning algorithms implemented with packages such as numpy
Evaluate the algorithms in a variety of financial scenarios

Significance

Develop a machine learning framework tailored to finance, providing solutions better suited to financial problems than general-purpose machine learning libraries alone
Give machine learning beginners a framework for getting started quickly with basic algorithms

📈 Applications

Problem	Solution	Algorithm
Risk Management	Classification, Regression	Principal Component Analysis (PCA), AdaBoost
Financial Fraud Detection	Classification, Clustering	K-Nearest Neighbors (KNN), K-Means, DBSCAN
Customer Relationship Management	Classification	Naive Bayes, AdaBoost
Financial Forecast	Regression	Support Vector Machine (SVM)
Investment and Asset Management	Regression	Linear Regression

🎮 Other Machine Learning Algorithms

Multilayer Perceptron, used for classification / regression, can be applied to promotion, fraud detection and so on.
Decision Tree, used for classification, can be applied to direct marketing, risk management and so on.
...

Main Content

✨Here is an overview of this framework.

🔧 Setup

Install the dependencies.

pip install -r requirements.txt

🕹️ Run

Initialize the weights. Create a model config YAML file under ./config/ that specifies the name and initial value of each model weight. Examples can be found in the existing config directory.
Write the bash command in a script under ./scripts/. Examples can be found in the existing scripts directory.

bash scripts/run_svm.sh

While the script runs, you are prompted to describe in natural language how you would like the data to be preprocessed. After model training, you are also prompted for the evaluation metric and the visualization method to use.

📁 Main File Structure

├── main.py
├── pipeline.py
├── ...
├── /dataset/
│  ├── base.py
│  ├── risk_management.py
│  ├── investment_and_asset_management.py
│  ├── ...
│  └── finance_prediction.py
├── /algorithms/
│  ├── base.py
│  ├── svm.py
│  ├── linear_regression.py
│  ├── ...
│  └── pca.py
├── /evaluate/
│  └── utils.py
├── /visualization/
│  └── utils.py
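The layout suggests that each algorithm module builds on a shared interface in algorithms/base.py. The following is a minimal sketch of what such an interface could look like, not the repository's actual code: the names `BaseAlgorithm` and `LinearRegression` and the `fit`/`predict` methods are assumptions made for illustration.

```python
from abc import ABC, abstractmethod

import numpy as np


class BaseAlgorithm(ABC):
    """Hypothetical shared interface; the real base.py may differ."""

    @abstractmethod
    def fit(self, X: np.ndarray, y: np.ndarray) -> "BaseAlgorithm":
        ...

    @abstractmethod
    def predict(self, X: np.ndarray) -> np.ndarray:
        ...


class LinearRegression(BaseAlgorithm):
    """Ordinary least squares with an intercept, solved with numpy."""

    def fit(self, X, y):
        # Append a column of ones so the last weight acts as the intercept.
        A = np.column_stack([X, np.ones(len(X))])
        self.weights, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self

    def predict(self, X):
        A = np.column_stack([X, np.ones(len(X))])
        return A @ self.weights


# Noiseless toy data: slope 2, intercept 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * X[:, 0] + 1.0
model = LinearRegression().fit(X, y)
```

Keeping `fit` and `predict` on a common base class would let pipeline.py treat every algorithm the same way, which is one plausible reason for the base.py files in the tree.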

📊 Results and Evaluation

Common financial problems that machine learning can address fall into five categories: financial fraud detection, customer relationship management, financial forecasting, risk management, and investment and asset management. For each category, a suitable dataset and method were selected for testing and validation. The results are as follows.

Financial fraud detection

Dataset: Credit Card Fraud Detection Dataset 2023
Algorithms: K-Nearest Neighbors (KNN)

The K-Nearest Neighbors (KNN) algorithm achieved an accuracy of 78.9% on the credit card fraud dataset. Precision was 74.2% for non-fraudulent transactions and 86.0% for fraudulent transactions; recall was 88.9% for non-fraudulent transactions and 68.9% for fraudulent transactions. Fraud predictions are therefore usually correct, but the model misses about 31% of fraudulent transactions, so precision and recall are not balanced across the two classes and fraud recall is the main area for improvement.
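Per-class precision and recall like those reported above can be computed directly from the true and predicted labels. The sketch below uses made-up toy labels, not the credit card dataset, and the function names are illustrative rather than taken from evaluate/utils.py.

```python
def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall) for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels for illustration only (1 = fraud, 0 = non-fraud).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]

fraud_precision, fraud_recall = per_class_metrics(y_true, y_pred, label=1)
overall_accuracy = accuracy(y_true, y_pred)
```

In this toy case fraud recall is lower than fraud precision, the same pattern as in the reported results: a low recall on the fraud class means fraudulent transactions are being missed.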

Customer relationship management

Dataset: Credit Card Customer Churn Prediction
Algorithms: Naive Bayes

The Naive Bayes algorithm achieved an accuracy of 82.1% in predicting customer churn. Precision was 85.1% for non-churned customers and 57.3% for churned customers, while recall was 94.2% for non-churned customers but only 32.1% for churned customers. The model therefore misses roughly two thirds of churned customers, leaving considerable room for improvement.

Financial forecasting

Dataset: Stock Price of Netflix
Algorithms: SVM

Given historical volumes, predict whether a stock's open price is higher or lower than its close price. The label is 0 if the close price is higher than the open price and 1 if the open price is higher than the close price. Predictions were 56% accurate. However, every prediction was label 1, so this accuracy only reflects the share of label-1 samples in the test data; the model appears to have collapsed to a single class rather than learned a useful signal.
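The labeling rule and the single-class problem described above can be checked with a short sketch. The prices below are toy values, not the Netflix data, the function names are illustrative, and assigning ties (open equal to close) to label 1 is an assumption, since the text does not define that case.

```python
from collections import Counter


def make_labels(open_prices, close_prices):
    """Label 0 if close > open, else 1 (ties go to 1 by assumption)."""
    return [0 if c > o else 1 for o, c in zip(open_prices, close_prices)]


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


# Toy prices for illustration only.
open_prices = [10.0, 11.0, 12.0, 11.5, 12.5]
close_prices = [10.5, 10.8, 11.9, 12.0, 12.1]

labels = make_labels(open_prices, close_prices)
baseline = majority_baseline_accuracy(labels)
```

A classifier is only informative if its accuracy exceeds this baseline. A model that outputs a single label for every sample scores exactly the share of that label in the data.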

Risk management

Dataset: creditcard_2023
Algorithms: Principal Component Analysis (PCA)

The Silhouette Coefficient is 0.1366, indicating only a weak clustering structure in the data after principal component analysis (PCA). The two principal components together retained approximately 81.44% of the total variance, providing a fairly good dimensional compression of the original data.
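The retained-variance figure comes from the explained variance ratio of the leading principal components. The following numpy sketch shows one common way to compute it via SVD; it runs on synthetic data and is not the repository's pca.py, and the function name `pca` is illustrative.

```python
import numpy as np


def pca(X, n_components):
    """Project X onto its top principal components using SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    variances = singular_values ** 2
    explained_ratio = variances[:n_components] / variances.sum()
    projected = X_centered @ Vt[:n_components].T
    return projected, explained_ratio


rng = np.random.default_rng(0)
# Synthetic data for illustration: 200 samples, 5 features of unequal scale.
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.2])

projected, explained_ratio = pca(X, n_components=2)
retained = explained_ratio.sum()  # fraction of total variance kept
```

Here `retained` plays the role of the 81.44% figure: it is the sum of the explained variance ratios of the components that were kept.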

Investment and asset management

Dataset: boston_house_prices
Algorithms: Linear Regression

The output is a prediction model fitted to this dataset, together with four evaluation metrics: MSE=54.39, MAE=4.97, RMSE=7.37, R2=0.18. An R2 of 0.18 means the model explains only about 18% of the variance in the target, so the predictions are not close to the real data and the model leaves substantial room for improvement.
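The four metrics are related by simple formulas, which makes them easy to sanity-check. The sketch below uses the standard definitions on toy values, not the housing data; `regression_metrics` is an illustrative name, not a function from evaluate/utils.py.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE and R^2 for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_total = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    r2 = 1.0 - (mse * n) / ss_total
    return {"mse": mse, "mae": mae, "rmse": rmse, "r2": r2}


# Toy values for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
metrics = regression_metrics(y_true, y_pred)
```

RMSE is the square root of MSE, and the reported figures are consistent with that (the square root of 54.39 is about 7.37). R2 compares the model against always predicting the mean, so a value near 0 indicates little improvement over that baseline.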

Future Work

Develop interactive UI
Integrate more algorithms
Encapsulate the data preprocessing process to reduce the cost of getting started

pooruss/ML-Framework-for-Diverse-Applications-in-Trading-and-Finance
