🧠 Income Classification with Census Data

This project builds predictive models that classify adult income using U.S. Census Bureau data. The goal is to predict whether a person earns more than $50K per year.

📁 Files in This Repository

File	Description
`FinalProject.ipynb`	Jupyter Notebook with complete code and outputs
`README.md`	Project overview and usage instructions
`adult-dataset.csv`	Census income data, extracted from the Census Bureau database at http://www.census.gov/ftp/pub/DES/www/welcome.html

📊 Dataset Overview

The dataset contains demographic and employment-related attributes for U.S. adults. The target variable is Income, which takes one of two values: <=50K or >50K.

Features include:

Age (int)
Work Class (categorical)
Education (categorical)
Marital Status (categorical)
Occupation (categorical)
Race (categorical)
Sex (binary)
Hours per week (int)

🧹 Part A: Data Cleaning

Handled missing data by removing or imputing invalid entries
Converted categorical variables using one-hot encoding
Removed irrelevant columns
Ensured numeric data for model compatibility
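
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration on a four-row stand-in table, not the notebook's actual code: the column names and the use of `?` as the missing-value placeholder are assumptions and should be checked against `adult-dataset.csv`.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the census table; column names are assumed for illustration.
raw = pd.DataFrame({
    "age": [39, 50, 38, 53],
    "workclass": ["State-gov", "?", "Private", "Private"],
    "sex": ["Male", "Male", "Female", "Female"],
    "hours-per-week": [40, 13, 40, 40],
    "income": ["<=50K", "<=50K", ">50K", ">50K"],
})

# Treat the "?" placeholder (an assumption) as missing, then drop incomplete rows.
clean = raw.replace("?", np.nan).dropna()

# Encode the target as 0/1 and one-hot encode the remaining categorical columns.
y = (clean["income"] == ">50K").astype(int)
X = pd.get_dummies(clean.drop(columns="income"), dtype=int)

print(X.dtypes)
```

Dropping rows is the simplest option; imputing the most frequent category instead would keep more data at the cost of some added noise.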

📉 Part B: Dimensionality Reduction

Applied both SVD and PCA to reduce dataset dimensions.

✅ Results:

Explained variance: the retained PCA components capture more than 90% of the variance.
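
One way to reach a 90% variance threshold with scikit-learn is sketched below. The data is synthetic and the scaling step is an assumption; the notebook may select components differently.

```python
import numpy as np
from sklearn.decomposition import PCA, TruncatedSVD
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the encoded feature matrix (200 rows, 10 columns).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
X = latent @ rng.normal(size=(3, 10)) + 0.1 * rng.normal(size=(200, 10))

X_scaled = StandardScaler().fit_transform(X)

# A float n_components asks PCA to keep the fewest components
# whose cumulative explained variance exceeds that share.
pca = PCA(n_components=0.90)
X_pca = pca.fit_transform(X_scaled)
print(X_pca.shape[1], pca.explained_variance_ratio_.sum())

# TruncatedSVD takes an explicit component count and does not center the data itself.
svd = TruncatedSVD(n_components=X_pca.shape[1], random_state=0)
X_svd = svd.fit_transform(X_scaled)
print(X_svd.shape)
```

On data that is already centered, as here after `StandardScaler`, the two methods span essentially the same subspace; `TruncatedSVD` is mainly useful when the one-hot matrix is kept sparse.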

🤖 Part C: Model Training & Evaluation

1. Multi-Layer Perceptron (Neural Network)

Built with MLPClassifier
Evaluated using Confusion Matrix & Classification Report
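
A minimal sketch of this step, assuming a standard train/test split and feature scaling. The synthetic data, the hidden layer size, and the split ratio are illustrative choices, not values from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary data standing in for the cleaned census features.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling first helps the network converge; the layer size is arbitrary.
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
)
mlp.fit(X_train, y_train)
y_pred = mlp.predict(X_test)

cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted class
print(cm)
print(classification_report(y_test, y_pred))
```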

2. Logistic Regression

Applied using Scikit-learn
Good baseline model
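
As a rough illustration of how a baseline score can be obtained, the sketch below cross-validates a scaled logistic regression on synthetic data. The five-fold setup and the `max_iter` value are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary data standing in for the cleaned census features.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Keeping the scaler inside the pipeline means it is refit on each training fold.
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(baseline, X, y, cv=5, scoring="accuracy")
print(scores.mean())
```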

3. Naïve Bayes (GaussianNB)

Fast and interpretable model
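
The interpretability comes from how little the fitted model stores. The sketch below, on synthetic data, prints the learned class priors and per-class feature means. Note that `GaussianNB` assumes roughly Gaussian features, which one-hot columns are not, so results on the encoded census data may differ in character.

```python
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

# Synthetic binary data with four features, for illustration only.
X, y = make_classification(
    n_samples=500, n_features=4, n_informative=2, n_redundant=0, random_state=0
)

nb = GaussianNB().fit(X, y)

print(nb.class_prior_)  # estimated probability of each class
print(nb.theta_)        # per-class feature means, shape (n_classes, n_features)
```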

4. K-Means Clustering

Unsupervised clustering excluding Income
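
A sketch of clustering with the label held out, assuming two clusters to mirror the two income classes. The data is synthetic, and the comparison against the held-out label at the end is an optional check, not something the README states the notebook does.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two synthetic groups standing in for the feature matrix with Income removed.
X = np.vstack([
    rng.normal(0, 1, size=(100, 4)),
    rng.normal(4, 1, size=(100, 4)),
])
income = np.array([0] * 100 + [1] * 100)  # held out; never passed to KMeans

X_scaled = StandardScaler().fit_transform(X)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
clusters = kmeans.fit_predict(X_scaled)

# Cluster ids are arbitrary, so compare with a permutation-invariant score.
print(adjusted_rand_score(income, clusters))
```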

🧪 Results Summary

Model	Accuracy	Precision	Recall	F1 Score
MLP	✅ High	✅ High	✅ High	✅ High
Logistic Regression	Moderate	Moderate	Moderate	Moderate
Naïve Bayes	Moderate	Lower	Moderate	Moderate
K-Means	-	-	-	- (unsupervised)
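
The four columns of the table can be computed with scikit-learn's metric helpers. The sketch below does so for two of the models on synthetic data; the numbers it prints describe that synthetic data only and say nothing about the census results.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic binary data standing in for the cleaned census features.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
}

rows = {}
for name, model in models.items():
    y_pred = model.fit(X_train, y_train).predict(X_test)
    # average="binary" reports the metrics for the positive class (label 1).
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="binary"
    )
    rows[name] = (accuracy_score(y_test, y_pred), precision, recall, f1)

print(f"{'Model':<20}{'Accuracy':>10}{'Precision':>10}{'Recall':>10}{'F1':>10}")
for name, vals in rows.items():
    print(f"{name:<20}" + "".join(f"{v:>10.3f}" for v in vals))
```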

🚀 Getting Started

🔧 Install Dependencies

pip install numpy pandas matplotlib seaborn scikit-learn
