Classified phishing websites using one-hot encoded parameters and compared the predictions of Logistic Regression, Random Forest, and XGBoost models. Improved accuracy from 85% to 95% through hyperparameter tuning with grid and random search.
Python version: 3.7
Packages: pandas, numpy, sklearn, matplotlib, seaborn
Other resources...
Columns
After acquiring the data, I needed to clean it up so that it was usable for the model. I made the following changes and created the following variables:
First, I transformed the categorical variables into dummy variables. I also split the data into train and test sets with a test size of 20%.
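A minimal sketch of this preprocessing step, assuming the cleaned data sits in a DataFrame loaded from a CSV with a binary target column named `label` (the file name and column name are placeholders, not the project's actual names):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("phishing_data.csv")  # hypothetical file name

# One-hot encode the categorical parameters into dummy variables.
X = pd.get_dummies(df.drop(columns=["label"]))
y = df["label"]

# 80/20 train/test split, stratified so both sets keep the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```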
I tried three different models and evaluated them using Mean Absolute Error (MAE). I chose MAE because it is easy to interpret and is not overly sensitive to outliers.
The three different models I tried were Logistic Regression, Random Forest, and XGBoost; a comparison sketch follows.
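The sketch below compares the three models on test-set MAE, assuming the `X_train`/`X_test`/`y_train`/`y_test` split from the preprocessing sketch above; the model settings shown are defaults, not the project's tuned parameters:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import mean_absolute_error
from xgboost import XGBClassifier

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=42),
    "XGBoost": XGBClassifier(eval_metric="logloss", random_state=42),
}

# Fit each model and report MAE on the held-out test set.
for name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    print(f"{name}: MAE = {mean_absolute_error(y_test, preds):.3f}")
```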
The random forest model far outperformed the other approaches on the test and validation sets. RF: MAE =
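For the hyperparameter tuning mentioned in the overview, a sketch of grid and random search over the random forest is shown below; the parameter grid values are assumptions for illustration, not the ones used in the project:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

param_grid = {
    "n_estimators": [100, 300, 500],
    "max_depth": [None, 10, 20],
    "min_samples_split": [2, 5, 10],
}

# Exhaustive grid search with 5-fold cross-validation.
grid = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    scoring="accuracy",
    cv=5,
)
grid.fit(X_train, y_train)

# Random search samples a subset of the same grid.
rand = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    n_iter=10,
    scoring="accuracy",
    cv=5,
    random_state=42,
)
rand.fit(X_train, y_train)

print("Grid search best CV accuracy:", grid.best_score_)
print("Random search best CV accuracy:", rand.best_score_)
```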