gill-0/Identifying-Fraud-at-Enron

Identifying-Fraud-at-Enron

Introduction

The goal of the Enron case study is to analyze a dataset of financial and email features for people who worked at Enron during the scandal, as well as others who did business with the company. I will test several supervised machine learning algorithms to learn patterns that generalize and to predict which individuals may have been involved in fraud, as indicated by the label POI (person of interest).

Final Analysis

The bl.ocks page linked below explains my analysis and results.

http://bl.ocks.org/gill-0/raw/a44ff333180fb13d460ee57c0345f0e4/

Files

Presentation of process and findings

Enron_fraud.html

Main script to create classifier

poi_id_final.py

Discover and graph outliers

final_outliers.py

Initial exploration and cleaning of data

explore_final.py

Create two email features to test in the classifier

email_fraction.py

Udacity file provided to format and split data

feature_format.py

Udacity file provided to test performance of ML algorithm

tester.py
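
As a rough illustration of the kind of feature email_fraction.py could create, the sketch below computes, for each person, the fraction of their email exchanged with POIs. This is an assumption, not the script's actual contents: the function names, the new feature names (fraction_from_poi, fraction_to_poi), the input field names (from_poi_to_this_person, to_messages, from_this_person_to_poi, from_messages) and the "NaN" string for missing values are all taken to follow the Udacity dataset conventions and may differ from what the repository uses.

```python
import math


def compute_fraction(poi_messages, all_messages):
    """Return poi_messages / all_messages, or 0.0 if either value is missing.

    Assumption: missing values may appear as None, the string "NaN",
    or a float NaN.
    """
    def is_missing(value):
        return (
            value is None
            or value == "NaN"
            or (isinstance(value, float) and math.isnan(value))
        )

    if is_missing(poi_messages) or is_missing(all_messages) or all_messages == 0:
        return 0.0
    return float(poi_messages) / float(all_messages)


def add_email_fractions(data_dict):
    """Add two hypothetical fraction features to every person's record in place."""
    for features in data_dict.values():
        # Share of this person's received email that came from a POI.
        features["fraction_from_poi"] = compute_fraction(
            features.get("from_poi_to_this_person"), features.get("to_messages")
        )
        # Share of this person's sent email that went to a POI.
        features["fraction_to_poi"] = compute_fraction(
            features.get("from_this_person_to_poi"), features.get("from_messages")
        )
    return data_dict
```

Using fractions rather than raw counts keeps heavy and light email users comparable, and mapping missing values to 0.0 is one simple choice among several.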

About

Machine Learning Classification on Unbalanced Real World Dataset
