Fraud Detection Project

This project analyzes and detects fraudulent transactions using a dataset of financial transactions. The workflow includes data exploration, visualization, feature engineering, and building a machine learning model to classify transactions as fraudulent or not. The trained model is deployed using Streamlit for interactive visualization and prediction.

Dataset

The dataset is loaded from an Excel file (Fraud.xlsx).
Key columns include transaction type, amount, balances, and fraud indicators.
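
As a minimal sketch of the loading step, the snippet below reads the Excel file with pandas and checks that the two label columns named in this README are present. The helper names (`validate_columns`, `load_transactions`) and the default path are assumptions for illustration, not taken from the notebook.

```python
from pathlib import Path

import pandas as pd

# Only these two column names appear in this README; other names are not assumed here
REQUIRED_COLUMNS = ("isFraud", "isFlaggedFraud")


def validate_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Raise a clear error if an expected label column is missing."""
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    return df


def load_transactions(path: str = "Fraud.xlsx") -> pd.DataFrame:
    """Read the Excel file into a DataFrame and check the label columns."""
    if not Path(path).exists():
        raise FileNotFoundError(f"{path} not found; adjust the path to your copy")
    # pandas delegates .xlsx parsing to the openpyxl package
    return validate_columns(pd.read_excel(path))
```

Calling `load_transactions()` from the directory that holds Fraud.xlsx returns the full DataFrame, or fails early with a readable message.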

Steps Performed

Data Exploration & Cleaning
- Checked for missing values and data types.
- Explored class distribution for isFraud and isFlaggedFraud.
- Visualized transaction types and fraud rates.
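
The checks above could be sketched roughly as follows. The `type` and `amount` column names and the tiny sample are assumptions for illustration; only `isFraud` and `isFlaggedFraud` are named in this README.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Collect the basic exploration checks into one dictionary."""
    return {
        "missing_per_column": df.isna().sum().to_dict(),
        "dtypes": df.dtypes.astype(str).to_dict(),
        "fraud_counts": df["isFraud"].value_counts().to_dict(),
        "flagged_counts": df["isFlaggedFraud"].value_counts().to_dict(),
        # The mean of a 0/1 label within each group is that group's fraud rate
        "fraud_rate_by_type": df.groupby("type")["isFraud"].mean().to_dict(),
    }


# Tiny made-up sample, for illustration only
sample = pd.DataFrame(
    {
        "type": ["TRANSFER", "TRANSFER", "PAYMENT", "CASH_OUT"],
        "amount": [100.0, 250.0, None, 80.0],
        "isFraud": [1, 0, 0, 1],
        "isFlaggedFraud": [0, 0, 0, 0],
    }
)
summary = summarize(sample)
```
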
Feature Engineering
- Created new features such as balance differences.
- Filtered and analyzed suspicious patterns (e.g., zero balances after transfer).
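
One way the balance features and the zero-balance filter might look is sketched below. The balance column names (`oldbalanceOrg`, `newbalanceOrig`, `oldbalanceDest`, `newbalanceDest`), the new feature names, and the choice of transaction types to filter on are all assumptions; adapt them to the actual columns in Fraud.xlsx.

```python
import pandas as pd


def add_balance_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with origin and destination balance differences."""
    out = df.copy()
    # Amount that left the origin account
    out["balanceDiffOrig"] = out["oldbalanceOrg"] - out["newbalanceOrig"]
    # Amount that arrived in the destination account
    out["balanceDiffDest"] = out["newbalanceDest"] - out["oldbalanceDest"]
    return out


def zero_balance_after_transfer(df: pd.DataFrame) -> pd.DataFrame:
    """Rows where a transfer-like transaction emptied a non-empty account."""
    mask = (
        df["type"].isin(["TRANSFER", "CASH_OUT"])
        & (df["oldbalanceOrg"] > 0)
        & (df["newbalanceOrig"] == 0)
    )
    return df[mask]


# Made-up rows, for illustration only
sample = pd.DataFrame(
    {
        "type": ["TRANSFER", "PAYMENT", "CASH_OUT", "TRANSFER"],
        "amount": [500.0, 100.0, 300.0, 50.0],
        "oldbalanceOrg": [500.0, 1000.0, 300.0, 0.0],
        "newbalanceOrig": [0.0, 900.0, 0.0, 0.0],
        "oldbalanceDest": [0.0, 0.0, 200.0, 10.0],
        "newbalanceDest": [500.0, 0.0, 500.0, 60.0],
    }
)
features = add_balance_features(sample)
suspicious = zero_balance_after_transfer(features)
```
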
Visualization
- Plotted distributions of transaction amounts.
- Visualized fraud rates by transaction type and over time.
- Plotted correlation heatmaps for key features.
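
A rough sketch of three of these plots using matplotlib alone is shown below (seaborn's heatmap would serve equally well for the last panel). The `type` and `amount` column names and the random demo data are assumptions. The fraud-over-time plot is left out because this README does not name the time column.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for scripts; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_overview(df: pd.DataFrame):
    """Draw amount distribution, fraud rate by type, and a correlation heatmap."""
    fig, axes = plt.subplots(1, 3, figsize=(15, 4))

    # Log scale keeps a few very large amounts from flattening the histogram
    axes[0].hist(np.log1p(df["amount"]), bins=30)
    axes[0].set_title("log1p(amount)")

    rates = df.groupby("type")["isFraud"].mean().sort_values()
    axes[1].bar(rates.index, rates.values)
    axes[1].set_title("Fraud rate by type")

    corr = df.select_dtypes("number").corr()
    im = axes[2].imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
    axes[2].set_xticks(range(len(corr.columns)))
    axes[2].set_xticklabels(corr.columns, rotation=45, ha="right")
    axes[2].set_yticks(range(len(corr.columns)))
    axes[2].set_yticklabels(corr.columns)
    axes[2].set_title("Correlation")
    fig.colorbar(im, ax=axes[2])
    fig.tight_layout()
    return fig


# Random demo data, for illustration only
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {
        "type": rng.choice(["TRANSFER", "PAYMENT", "CASH_OUT"], size=200),
        "amount": rng.exponential(1000.0, size=200),
        "isFraud": rng.integers(0, 2, size=200),
    }
)
fig = plot_overview(demo)  # call fig.savefig("overview.png") to write it to disk
```
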
Model Building
- Selected features and split data into training and test sets.
- Preprocessed data using scaling and one-hot encoding.
- Built a pipeline with logistic regression, using class_weight="balanced" to offset the class imbalance.
- Evaluated model with classification report and confusion matrix.
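
The modeling steps above can be sketched with scikit-learn as follows. The feature lists, the split ratio, `max_iter`, and the synthetic stand-in data are assumptions for illustration; the notebook's actual feature selection may differ.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = ["amount", "oldbalanceOrg", "newbalanceOrig"]  # assumed column names
CATEGORICAL = ["type"]  # assumed column name


def build_pipeline() -> Pipeline:
    """Scale numeric columns, one-hot encode categoricals, then classify."""
    preprocess = ColumnTransformer(
        [
            ("num", StandardScaler(), NUMERIC),
            ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
        ]
    )
    return Pipeline(
        [
            ("prep", preprocess),
            ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
        ]
    )


# Synthetic stand-in for the real dataset
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame(
    {
        "type": rng.choice(["TRANSFER", "CASH_OUT", "PAYMENT"], size=n),
        "amount": rng.exponential(1000.0, size=n),
        "oldbalanceOrg": rng.exponential(5000.0, size=n),
    }
)
df["newbalanceOrig"] = np.maximum(df["oldbalanceOrg"] - df["amount"], 0.0)
df["isFraud"] = ((df["type"] != "PAYMENT") & (df["newbalanceOrig"] == 0)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    df[NUMERIC + CATEGORICAL],
    df["isFraud"],
    test_size=0.25,
    stratify=df["isFraud"],  # keep the fraud share equal in both splits
    random_state=42,
)
model = build_pipeline().fit(X_train, y_train)
y_pred = model.predict(X_test)

print(classification_report(y_test, y_pred, zero_division=0))
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(cm)
```

Keeping preprocessing inside the pipeline means the scaler and encoder are fitted on the training split only and are reapplied automatically at prediction time.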
Model Saving
- Saved the trained pipeline using joblib for future use.
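
A minimal sketch of the save-and-reload round trip with joblib is shown below. The toy pipeline and the temporary directory are only there to keep the example self-contained, and the file name is an assumption.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy pipeline standing in for the trained fraud model
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = [0, 0, 1, 1]
pipeline = Pipeline(
    [("scale", StandardScaler()), ("clf", LogisticRegression())]
).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "fraud_detection_pipeline_model.pkl"  # assumed file name
    joblib.dump(pipeline, path)  # serializes preprocessing and model together
    restored = joblib.load(path)

# The reloaded pipeline should reproduce the original predictions
same = bool((restored.predict(X) == pipeline.predict(X)).all())
```
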
Deployment with Streamlit
- Developed a Streamlit app for interactive visualization and prediction.
- Users can upload transaction data and get real-time fraud predictions.
- Visualizations of transaction patterns and model results are available in the app.
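
The app might be structured roughly as in the sketch below. The function names, the model path, and the choice of CSV as the upload format are assumptions; this README does not describe the app's internals.

```python
import pandas as pd


def add_predictions(model, df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a predicted label and fraud probability."""
    out = df.copy()
    out["predicted_fraud"] = model.predict(df)
    out["fraud_probability"] = model.predict_proba(df)[:, 1]
    return out


def main() -> None:
    # Imported here so add_predictions can be reused without Streamlit installed
    import joblib
    import streamlit as st

    st.title("Fraud Detection")
    model = joblib.load("fraud_detection_pipeline_model.pkl")  # assumed path
    uploaded = st.file_uploader("Upload transactions (CSV)", type=["csv"])
    if uploaded is not None:
        scored = add_predictions(model, pd.read_csv(uploaded))
        st.dataframe(scored)
        st.bar_chart(scored["predicted_fraud"].value_counts())


# In the app script, end the file with a call to main()
```

Separating `add_predictions` from the Streamlit calls keeps the scoring logic testable outside the web interface.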

Model Performance

Accuracy: ~94%

Requirements

Python 3.x
pandas
numpy
matplotlib
seaborn
scikit-learn
joblib
streamlit
openpyxl (used by pandas to read .xlsx files)

Install dependencies with:

pip install pandas numpy matplotlib seaborn scikit-learn joblib streamlit openpyxl

Usage

Place Fraud.xlsx in the directory from which the notebook loads it.
Run the notebook Fraud_Detection.ipynb step by step to train and save the model.
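Start the Streamlit app (if your checkout has no app.py, substitute the name of the app script, for example fraud_detection.py):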
```
streamlit run app.py
```
Use the web interface to visualize data and make predictions.

Results

The model provides classification metrics and a confusion matrix for fraud detection.
Visualizations help understand transaction patterns and fraud distribution.
The Streamlit app allows for interactive exploration and prediction.

License

This project is for educational purposes.

Manishdebnath99/Fraud-Detection
