Note: This is a prototype and not the actual code used in production.
This MVP fraud detection model serves as a benchmark for evaluating the performance of various machine learning algorithms, including XGBoost, CatBoost, Random Forest, Neural Network, and Logistic Regression. The project comprises three Jupyter notebooks dedicated to data analysis, model development, and performance evaluation. The structure and purpose of each notebook are outlined below:
This notebook contains frequently used functions organized as utility macros. These functions can be reused across different projects to streamline tasks such as data preprocessing, visualization, and performance evaluation.
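As an illustration of the kind of reusable helper described above, here is a minimal sketch of a performance-evaluation utility. The function name `classification_summary` and the label convention (1 = fraud, 0 = legitimate) are assumptions made for this example and are not taken from the notebook itself.

```python
from typing import Dict, Sequence


def classification_summary(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return precision, recall, and F1 for binary labels (1 = fraud, 0 = legitimate)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the confusion-matrix cells that matter for the positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels and predictions, used only to exercise the helper.
summary = classification_summary([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(summary)
```

Keeping helpers like this free of notebook state makes them easy to import from the other notebooks or from a separate project.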
This notebook focuses on exploratory data analysis (EDA). It includes steps for loading data, cleaning, visualizing, and identifying patterns and trends in the dataset.
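The loading, cleaning, and pattern-finding steps might look roughly like the sketch below. The column names (`amount`, `channel`, `is_fraud`) and the toy rows are hypothetical stand-ins, since the real dataset and its schema are not described here; plotting is left out to keep the example short.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real transaction dataset.
raw = pd.DataFrame(
    {
        "amount": [12.5, 250.0, np.nan, 40.0, 9800.0, 40.0],
        "channel": ["web", "pos", "web", "web", "web", "web"],
        "is_fraud": [0, 0, 0, 0, 1, 0],
    }
)

# Cleaning: drop exact duplicate rows, then fill missing amounts with the median.
clean = raw.drop_duplicates().copy()
clean["amount"] = clean["amount"].fillna(clean["amount"].median())

# Class balance: fraud is typically rare, so check the label distribution first.
fraud_rate = clean["is_fraud"].mean()

# Simple pattern check: average transaction amount per class.
mean_amount_by_class = clean.groupby("is_fraud")["amount"].mean()

print(clean.describe())
print(f"fraud rate: {fraud_rate:.2%}")
print(mean_amount_by_class)
```

Checking the class balance early matters for fraud data, because a heavily skewed label distribution affects both the choice of metric and how the data should be split.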
This notebook handles the process of building and evaluating machine learning models. It covers tasks like data splitting, model training, hyperparameter tuning, and performance evaluation.
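A minimal sketch of that workflow with Scikit-learn is shown below, under several assumptions: the data is synthetic (generated with `make_classification`), only two of the benchmarked model families are included, and the hyperparameter grids and the ROC AUC metric are illustrative choices rather than the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced data (about 5% positives) standing in for real transactions.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)

# Stratify so the train and test splits keep the same fraud rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate models with small, illustrative hyperparameter grids.
candidates = {
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"C": [0.1, 1.0, 10.0]},
    ),
    "random_forest": (
        RandomForestClassifier(n_estimators=50, random_state=0),
        {"max_depth": [3, None]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Cross-validated grid search on the training split only.
    search = GridSearchCV(estimator, grid, scoring="roc_auc", cv=3)
    search.fit(X_train, y_train)
    # Score the refit best model on the held-out test split.
    test_scores = search.predict_proba(X_test)[:, 1]
    results[name] = roc_auc_score(y_test, test_scores)

for name, auc in results.items():
    print(f"{name}: test ROC AUC = {auc:.3f}")
```

Tuning inside the training split and scoring once on the held-out split keeps the comparison between models fair; other estimators with a Scikit-learn-compatible interface could be added to `candidates` in the same way.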
- Python 3.x
- Jupyter Notebook
- Required libraries: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn
- Clone the repository.
- Install dependencies using `pip install -r requirements.txt`.
- Open each notebook and execute the cells in order.
Sean Seunghyun Kim
Email: seunghyk@tepper.cmu.edu Phone: (949) 572 7370