Skip to content

nadirg2/fraud-detection

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Fraud Detection using Machine Learning

This repository contains a Jupyter Notebook demonstrating a machine learning approach to detect fraudulent activities. The project focuses on analyzing patterns in financial transactions and building a model to identify potential fraud.

Project Overview

Fraud detection is a critical problem in many industries, especially finance and e-commerce. This project leverages machine learning techniques to build a classification model capable of distinguishing between fraudulent and non-fraudulent transactions.

The dataset and methods used are designed to explore key concepts such as feature engineering, model evaluation, and interpretability in fraud detection.

Features

  • Data Preprocessing: Cleaning and preparing data for analysis.
  • Feature Engineering: Selecting and creating features that improve model performance.
  • Modeling: Training and evaluating machine learning models.
  • Evaluation Metrics: Using precision, recall, F1 score, and AUC to assess performance.
  • Visualization: Exploring data insights and results through visual tools.

Notebook Content

The notebook includes the following steps:

  1. Data Loading: Importing the dataset for analysis.
  2. Exploratory Data Analysis (EDA): Visualizing the data distribution and identifying patterns.
  3. Data Preprocessing: Handling missing values, scaling features, and encoding categorical variables.
  4. Model Training: Training machine learning models such as logistic regression, decision trees, and random forests.
  5. Model Evaluation: Assessing the models using appropriate metrics and selecting the best-performing one.
  6. Fraud Detection Insights: Interpreting the results and understanding the key factors influencing fraud detection.

How to Run

  1. Clone this repository:
    git clone https://github.com/nadirg2/fraud-detection.git
  2. Install the required libraries:
    pip install -r requirements.txt  
  3. Open the Jupyter Notebook:
    jupyter notebook notebook.ipynb  
    
  4. Run the cells in the notebook to reproduce the results.

Technologies Used

Python
Pandas, NumPy for data manipulation
Scikit-learn for machine learning
Matplotlib, Seaborn for visualization

Used Techniques Overview

  • Isolation Forest: An unsupervised anomaly detection method that isolates outliers in the data by randomly selecting features and splitting the data.

  • DBSCAN (Density-Based Spatial Clustering of Applications with Noise): A clustering algorithm that groups together points that are closely packed while marking points in low-density regions as outliers.

  • SVM (Support Vector Machines): A supervised learning method effective for binary classification tasks, often used with kernel tricks to handle non-linear data.

  • PCA (Principal Component Analysis): A dimensionality reduction technique that projects data onto the directions of maximum variance to simplify analysis while preserving key information.

  • t-SNE (t-Distributed Stochastic Neighbor Embedding): A non-linear dimensionality reduction algorithm optimized for visualizing high-dimensional data in 2D or 3D.

  • UMAP (Uniform Manifold Approximation and Projection): A fast and flexible dimensionality reduction method that preserves local and global data structure better than t-SNE in many cases.

Results

The final model achieves a high precision and recall, making it effective at identifying fraudulent transactions while minimizing false positives. Future Work

Explore deep learning techniques for improved accuracy.
Incorporate additional datasets for enhanced model robustness.
Deploy the model as a web application for real-time fraud detection.

Contributing

Contributions are welcome! If you have suggestions or improvements, feel free to open an issue or submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

About

Поиск аномалий в банковских транзакциях

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published