This project is a machine learning system that predicts whether a Titanic passenger would have survived, based on features such as age, gender, ticket class, fare, and embarkation point.
The model is built with Python, pandas, and scikit-learn, and is deployed with Streamlit, which provides an interactive web interface for predictions.
- Data Preprocessing: Handles missing values and categorical variables.
- Exploratory Data Analysis (EDA): Visualizes key patterns and insights.
- Feature Engineering: Transforms data for better model performance.
- Machine Learning Model: Implements Logistic Regression for prediction.
- Web App with Streamlit: Allows users to input passenger details and get survival predictions.
TitanicSurvivalSystem/
├── data/
│   └── train.csv
├── app.py
├── run_streamlit.py
└── README.md
- data/: Contains Titanic dataset.
- app.py: Streamlit app for prediction.
- run_streamlit.py: Script to launch the web app.
- README.md: Project documentation.
git clone https://github.com/yourusername/TitanicSurvivalSystem.git
cd TitanicSurvivalSystem
python -m venv venv # create the virtual environment (all platforms)
source venv/bin/activate # macOS/Linux
venv\Scripts\activate # Windows
pip install -r requirements.txt
The dataset contains:
- PassengerId (Unique ID)
- Survived (Target: 1 = survived, 0 = did not survive)
- Pclass (Ticket class)
- Name, Sex, Age
- SibSp, Parch (Family relations)
- Ticket, Fare, Cabin, Embarked
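Before cleaning, it helps to see which of the columns listed above actually contain gaps. The following is a minimal sketch, not the project's own code: the inline rows are hand-written placeholders in the `train.csv` layout, and `missing_counts` is a hypothetical helper name.

```python
import io

import pandas as pd

# A few hand-written rows in the train.csv layout (illustrative, not real records).
SAMPLE_CSV = """PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,Passenger A,male,22,1,0,T1,7.25,,S
2,1,1,Passenger B,female,38,1,0,T2,71.28,C85,C
3,1,3,Passenger C,female,,0,0,T3,7.92,,
4,0,2,Passenger D,male,35,0,0,T4,13.00,,Q
"""


def missing_counts(df: pd.DataFrame) -> pd.Series:
    """Return the number of missing values per column, largest first."""
    return df.isna().sum().sort_values(ascending=False)


# Swap io.StringIO(SAMPLE_CSV) for "data/train.csv" to inspect the real dataset.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))
counts = missing_counts(df)
print(counts[counts > 0])
```

Columns that show up in this output are the ones that need a fill or drop strategy.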
- Age: Filled with median age.
- Embarked: Filled with mode (most common value).
- Cabin: Dropped due to excessive missing values.
- Sex: Converted to numeric (0 = Male, 1 = Female).
- Embarked: One-hot encoding for categorical values (C, Q, S).
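The cleaning and encoding steps above could be sketched in pandas roughly as follows. This is an illustration under assumptions rather than the project's actual implementation: the `preprocess` function name and the tiny `raw` frame are hypothetical, and the one-hot columns are assumed to be named `Embarked_C`, `Embarked_Q`, and `Embarked_S`.

```python
import pandas as pd


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the fill, drop, and encoding steps to a raw Titanic frame."""
    out = df.copy()
    # Fill missing ages with the median age.
    out["Age"] = out["Age"].fillna(out["Age"].median())
    # Fill missing embarkation points with the most common value.
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode()[0])
    # Cabin has too many missing values to be useful.
    out = out.drop(columns=["Cabin"])
    # Encode Sex as 0 = male, 1 = female.
    out["Sex"] = out["Sex"].map({"male": 0, "female": 1})
    # Fixing the categories guarantees that C, Q, and S columns all exist,
    # even if one port is absent from the data being encoded.
    out["Embarked"] = pd.Categorical(out["Embarked"], categories=["C", "Q", "S"])
    return pd.get_dummies(out, columns=["Embarked"], prefix="Embarked", dtype=int)


# Hypothetical sample rows, only to exercise the function.
raw = pd.DataFrame({
    "Age": [22.0, None, 38.0, 30.0],
    "Sex": ["male", "female", "female", "male"],
    "Cabin": [None, "C85", None, None],
    "Embarked": ["S", "C", None, "S"],
})
clean = preprocess(raw)
print(clean)
```

Computing the median and mode on the training data only, and reusing those values at prediction time, avoids leaking information from the test split.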
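from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# X holds the preprocessed feature columns and y the Survived target.
# The 80/20 split and the random seed are assumptions, not documented settings.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# max_iter is raised so the solver converges on unscaled features.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
# Score the model on the held-out test split.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy:.2f}")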
- Accuracy: 81%
- Precision, Recall, F1-Score: Also evaluated, because accuracy alone can be misleading when the classes are imbalanced.
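The additional metrics can be computed with scikit-learn's metric functions. The sketch below uses hypothetical labels purely to show the calls; in the project, `y_true` and `y_hat` would be replaced by the real test labels and model predictions.

```python
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score

# Hypothetical labels (1 = survived), only to demonstrate the calls.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_hat = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]

# Precision: of the passengers predicted to survive, how many actually did.
precision = precision_score(y_true, y_hat)
# Recall: of the passengers who actually survived, how many were found.
recall = recall_score(y_true, y_hat)
# F1: harmonic mean of precision and recall.
f1 = f1_score(y_true, y_hat)

# Rows are true classes, columns are predicted classes.
matrix = confusion_matrix(y_true, y_hat)
print(matrix)
print(f"Precision: {precision:.2f}  Recall: {recall:.2f}  F1: {f1:.2f}")
```

Reporting these alongside accuracy shows whether the model is trading missed survivors for false alarms, which a single accuracy figure hides.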
streamlit run app.py
python -m streamlit run app.py # alternative if the streamlit command is not on your PATH
Once the app is running, open the local URL that Streamlit prints in the terminal in your browser.
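Inside the app, the passenger details a user enters have to be converted into the same numeric layout the model was trained on. The following is one possible sketch of that step: `build_feature_row`, `FEATURE_COLUMNS`, and the column order are assumptions for illustration and may not match the actual `app.py`.

```python
import pandas as pd

# Assumed feature order; it must match the columns the model was trained on.
FEATURE_COLUMNS = ["Pclass", "Sex", "Age", "Fare", "Embarked_C", "Embarked_Q", "Embarked_S"]


def build_feature_row(pclass: int, sex: str, age: float, fare: float, embarked: str) -> pd.DataFrame:
    """Turn raw form values into a one-row frame using the encodings described above."""
    if sex not in ("male", "female"):
        raise ValueError(f"unexpected sex: {sex!r}")
    if embarked not in ("C", "Q", "S"):
        raise ValueError(f"unexpected embarkation point: {embarked!r}")
    row = {
        "Pclass": pclass,
        "Sex": 0 if sex == "male" else 1,  # 0 = male, 1 = female
        "Age": age,
        "Fare": fare,
        # One-hot encode the embarkation point.
        "Embarked_C": int(embarked == "C"),
        "Embarked_Q": int(embarked == "Q"),
        "Embarked_S": int(embarked == "S"),
    }
    return pd.DataFrame([row], columns=FEATURE_COLUMNS)


row = build_feature_row(pclass=3, sex="female", age=29.0, fare=7.9, embarked="S")
print(row)
```

In a Streamlit app, the arguments would typically come from widgets such as `st.selectbox` and `st.number_input`, and the resulting frame would be passed to `model.predict`.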
- Implement more models (Random Forest, SVM, XGBoost).
- Add a graphical dashboard for EDA.
- Deploy the app using Heroku or AWS.
This project is licensed under the MIT License - see the LICENSE file for details.
Feel free to contribute by submitting a pull request or reporting issues!
📧 Email: ramnrngupta@gmail.com 📌 LinkedIn: Ram Narayan Gupta