A complete data cleaning and preprocessing project using the Titanic dataset from Kaggle. Includes missing value handling, outlier detection, feature engineering, and transformation — prepared for machine learning.

emreeilhan/titanic-data-cleaning

Titanic Data Cleaning Project 🚢

A complete data cleaning and preprocessing project using the Titanic dataset from Kaggle.


🔍 Project Goals

This project prepares the Kaggle Titanic dataset for machine learning.
The raw data is cleaned, existing columns are transformed, and new features are engineered so that the result can be passed directly to a classification model.


✅ Steps Completed

  • Missing value handling (Age, Cabin, Embarked)
  • Outlier detection and treatment
  • Feature extraction (e.g. extracting titles from names)
  • Encoding categorical variables (Sex, Embarked)
  • Feature scaling (Standardization and Normalization)
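
The steps above can be sketched in a few lines of pandas. This is a minimal illustration, not the notebook's actual code. The specific choices (median imputation for Age, mode for Embarked, a `HasCabin` flag in place of Cabin, clipping Fare to the 1.5 × IQR fences) are assumptions, and the tiny in-memory sample stands in for `data/train.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows using the Kaggle Titanic column names;
# the real project would load data/train.csv instead.
df = pd.DataFrame({
    "Name": ["Braund, Mr. Owen Harris", "Cumings, Mrs. John Bradley",
             "Heikkinen, Miss. Laina", "Allen, Mr. William Henry",
             "Rice, Master. Eugene"],
    "Sex": ["male", "female", "female", "male", "male"],
    "Age": [22.0, 38.0, np.nan, 35.0, 2.0],
    "Fare": [7.25, 71.28, 7.92, 8.05, 512.33],
    "Cabin": [np.nan, "C85", np.nan, np.nan, np.nan],
    "Embarked": ["S", "C", "S", np.nan, "Q"],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the five cleaning steps to a copy of df (illustrative choices)."""
    out = df.copy()

    # 1. Missing values: median for Age, most frequent port for Embarked,
    #    and a 0/1 flag instead of the mostly empty Cabin string.
    out["Age"] = out["Age"].fillna(out["Age"].median())
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode()[0])
    out["HasCabin"] = out["Cabin"].notna().astype(int)
    out = out.drop(columns="Cabin")

    # 2. Outliers: clip Fare to the 1.5 * IQR fences.
    q1, q3 = out["Fare"].quantile([0.25, 0.75])
    iqr = q3 - q1
    out["Fare"] = out["Fare"].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

    # 3. Feature extraction: the title sits between the comma and the period.
    out["Title"] = (
        out["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()
    )

    # 4. Encoding: binary map for Sex, one-hot columns for Embarked.
    out["Sex"] = out["Sex"].map({"male": 0, "female": 1})
    out = pd.get_dummies(out, columns=["Embarked"], prefix="Embarked", dtype=int)

    # 5. Scaling: standardization (zero mean, unit variance) for Age,
    #    min-max normalization to [0, 1] for Fare.
    out["Age_std"] = (out["Age"] - out["Age"].mean()) / out["Age"].std(ddof=0)
    fare_min, fare_max = out["Fare"].min(), out["Fare"].max()
    out["Fare_norm"] = (out["Fare"] - fare_min) / (fare_max - fare_min)
    return out


cleaned = clean(df)
print(cleaned[["Title", "Sex", "Age", "Fare", "HasCabin"]])
```

The scaling in step 5 is written out by hand to keep the sketch dependency-free. scikit-learn's `StandardScaler` and `MinMaxScaler` compute the same transformations and are the usual choice when the scaler must be fitted on the training set and reused on the test set.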

📊 Libraries Used

  • pandas
  • numpy
  • matplotlib
  • seaborn
  • scikit-learn (imported as sklearn)

📁 Project Structure

titanic-data-cleaning/
├── data/
│   ├── train.csv
│   └── test.csv
├── notebooks/
│   └── 01_data_cleaning.ipynb
├── README.md
└── requirements.txt

📌 Notebook Preview

You can follow the full data cleaning process in the notebook:
01_data_cleaning.ipynb


🧠 What You Will Learn

  • Exploratory Data Analysis (EDA)
  • Handling null and duplicate values
  • Visualizing distributions and outliers
  • Preparing a dataset for machine learning
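
As a rough illustration of the checks listed above, the sketch below counts nulls and duplicate rows and flags outliers with the 1.5 × IQR rule (the same points a box plot would draw beyond its whiskers). The sample rows and the helper name `iqr_outliers` are hypothetical; the notebook may inspect the data differently.

```python
import numpy as np
import pandas as pd

# Hypothetical rows using Kaggle Titanic column names; row 3 repeats row 0.
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male", "male", "male"],
    "Age": [22.0, 38.0, np.nan, 22.0, 35.0, 2.0],
    "Fare": [7.25, 71.28, 7.92, 7.25, 8.05, 512.33],
})

# Nulls per column and number of fully duplicated rows.
null_counts = df.isna().sum()
n_dupes = int(df.duplicated().sum())

# Drop exact duplicates, keeping the first occurrence.
deduped = df.drop_duplicates().reset_index(drop=True)


def iqr_outliers(s: pd.Series) -> pd.Series:
    """Boolean mask of values outside the 1.5 * IQR fences (illustrative helper)."""
    q1, q3 = s.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (s < q1 - 1.5 * iqr) | (s > q3 + 1.5 * iqr)


outlier_mask = iqr_outliers(deduped["Fare"])

print(null_counts)
print("duplicate rows:", n_dupes)
print(deduped[outlier_mask])
```

Passing the same column to `seaborn.boxplot` or `seaborn.histplot` gives the visual counterpart of this mask.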

📦 Coming Next

  • Classification models on cleaned data
  • Model evaluation and tuning
  • Submission to Kaggle!

👤 Author

emreeilhan (GitHub)

📜 License

This project is licensed under the MIT License.
