"This project analyzes the Titanic survival prediction dataset using machine learning techniques, focusing on data preprocessing, visualization, and model selection."
- Source: Kaggle (Titanic-Dataset)
- Features: Passenger details (e.g., Sex, Age, Fare, Pclass)
- Target: Survived (0 = No, 1 = Yes)
- Load the Data: Imported the Titanic dataset from Kaggle.
- Explore & Visualize: Analyzed distributions and relationships (e.g., Sex vs. Survival).
- Prepare Data: Handled missing values, encoded categorical variables, and scaled features.
- Model Selection: Compared SGDClassifier, LogisticRegression, and RandomForestClassifier (a condensed sketch of these steps follows this list).
- Conclusion: Selected a tuned RandomForestClassifier (test F1 = 0.7971).
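The notebook code itself isn't reproduced here, so below is a minimal end-to-end sketch of the steps above. It assumes the Kaggle CSV is saved locally as `Titanic-Dataset.csv` and that the engineered `log_fare`/`log_age` features come from a `log1p` transform; the exact column choices, imputation strategies, and model settings used in the project may differ.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# 1. Load the data (file name is an assumption; adjust to your local copy).
df = pd.read_csv("Titanic-Dataset.csv")

# 2. Quick exploration: survival rate by sex.
print(df.groupby("Sex")["Survived"].mean())

# 3. Feature engineering: log transforms behind the log_fare / log_age features.
df["log_fare"] = np.log1p(df["Fare"])
df["log_age"] = np.log1p(df["Age"])

features = ["Sex", "Pclass", "log_age", "log_fare", "Embarked"]
X, y = df[features], df["Survived"]

# Preprocessing: impute missing values, one-hot encode categoricals, scale numerics.
numeric = ["Pclass", "log_age", "log_fare"]
categorical = ["Sex", "Embarked"]
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("encode", OneHotEncoder(handle_unknown="ignore"))]), categorical),
])

# 4. Compare the three candidate models with 5-fold cross-validated F1.
models = {
    "SGDClassifier": SGDClassifier(random_state=42),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "RandomForestClassifier": RandomForestClassifier(random_state=42),
}
for name, model in models.items():
    pipe = Pipeline([("prep", preprocess), ("model", model)])
    scores = cross_val_score(pipe, X, y, cv=5, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.4f}")
```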
- Best Model: RandomForestClassifier with `class_weight='balanced'`, `max_depth=10`, `min_samples_split=5`, `n_estimators=300` (an evaluation sketch follows these results).
- Test F1-Score: 0.7971
- Precision: 0.8594
- Recall: 0.7432
- Key Features: Sex (0.361), log_fare (0.213), log_age (0.178).
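The hyperparameters and metrics above are the project's reported values. The sketch below shows one way to reproduce that evaluation, reusing `X`, `y`, and `preprocess` from the previous sketch; the 80/20 stratified split and `random_state` are assumptions, so the printed scores will not match the reported numbers exactly.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Split is an assumption for illustration; the hyperparameters are the reported ones.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

best_rf = Pipeline([
    ("prep", preprocess),
    ("model", RandomForestClassifier(
        class_weight="balanced",
        max_depth=10,
        min_samples_split=5,
        n_estimators=300,
        random_state=42)),
])
best_rf.fit(X_train, y_train)

pred = best_rf.predict(X_test)
print("F1:       ", round(f1_score(y_test, pred), 4))
print("Precision:", round(precision_score(y_test, pred), 4))
print("Recall:   ", round(recall_score(y_test, pred), 4))

# Feature importances from the fitted forest (names come from the
# ColumnTransformer's expanded feature set).
rf = best_rf.named_steps["model"]
names = best_rf.named_steps["prep"].get_feature_names_out()
for name, imp in sorted(zip(names, rf.feature_importances_), key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```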
- Importance of handling imbalanced data with `class_weight`.
- Value of feature engineering (e.g., log transformations; see the short example after this list).
- Effectiveness of RandomForest for non-linear relationships.
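As a small illustration of the log-transformation point (a sketch, not project code, and again assuming the `Titanic-Dataset.csv` file name): `log1p` sharply reduces the right skew of `Fare`, which is one reason engineered features like `log_fare` tend to help. For the imbalance point, `class_weight='balanced'` simply reweights classes inversely to their frequency in the training data.

```python
import numpy as np
import pandas as pd

# File name assumed, as in the earlier sketch.
df = pd.read_csv("Titanic-Dataset.csv")

# Fare is heavily right-skewed; log1p (log(1 + x)) compresses the long tail
# and handles zero fares safely.
print("Fare skew:    ", round(df["Fare"].skew(), 2))
print("log_fare skew:", round(np.log1p(df["Fare"]).skew(), 2))
```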