This repository contains the solved notebook for the competition. It is meant to help beginners understand the workflow of a machine learning problem. Machine learning is not just about learning algorithms; it also consists of several crucial steps:
- Data Preparation
- Data Scaling
- Dimensionality Reduction
- Model selection
- Training and testing the model
- Hyperparameter tuning [Grid Search, Random Search]
- Evaluation [`confusion_matrix`, `F1_score`, `precision`]
- Deployment [out of the scope of this repository]
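
The steps above can be sketched end to end with scikit-learn. This is a minimal illustration, not the notebook's actual code: the synthetic data from `make_classification`, the choice of `LogisticRegression`, and the parameter grid are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score, precision_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Data preparation: synthetic stand-in for the cleaned competition data (assumption).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scaling -> dimensionality reduction -> model, chained so that every
# cross-validation fold fits the scaler and PCA on its own training part only.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("reduce", PCA()),
    ("model", LogisticRegression(max_iter=1000)),
])

# Hyperparameter tuning with an exhaustive grid search (illustrative grid).
param_grid = {
    "reduce__n_components": [4, 8],
    "model__C": [0.1, 1.0, 10.0],
}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

# Evaluation on the held-out test split.
y_pred = search.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)

print("best parameters:", search.best_params_)
print("confusion matrix:\n", cm)
print("F1:", f1, "precision:", precision)
```

For a random search, `GridSearchCV` could be swapped for `RandomizedSearchCV` from the same module, which samples a fixed number of parameter combinations instead of trying all of them.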
You can read the full problem description on Kaggle: Kaggle | Spaceship Titanic Competition. Given a set of features [passenger details], we have to predict whether a passenger was transported or not. This is a binary classification problem that needs advanced feature engineering skills.
🥊 Challenges:
- The data has categorical features, which hurt performance if we just apply `LabelEncoder` to them.
- The variance of the features is very uneven, so scaling the data is necessary.
- Irrelevant columns such as `PassengerId` have to be removed.
- Other challenges: intermediate to advanced feature engineering skills are needed.
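
One possible way to handle the first three challenges is sketched below with pandas and scikit-learn. The tiny data frame and its values are made up for illustration; apart from `PassengerId`, the column names used here are assumptions about the dataset. One-hot encoding is shown as an alternative to label encoding because integer labels impose an arbitrary order on the categories, which can mislead some models.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy rows standing in for the competition data (values are invented).
df = pd.DataFrame({
    "PassengerId": ["0001_01", "0002_01", "0003_01", "0004_01"],
    "HomePlanet": ["Earth", "Europa", "Mars", "Earth"],
    "Age": [39.0, 24.0, 58.0, 33.0],
    "RoomService": [0.0, 109.0, 43.0, 0.0],
})

# Drop the identifier column: it carries no predictive signal in this sketch.
features = df.drop(columns=["PassengerId"])

# One-hot encode the categorical column instead of mapping it to integers.
features = pd.get_dummies(features, columns=["HomePlanet"], dtype=float)

# Standardize the numeric columns so their variances become comparable.
numeric_cols = ["Age", "RoomService"]
features[numeric_cols] = StandardScaler().fit_transform(features[numeric_cols])

print(features)
```

In a real workflow the scaler would be fitted on the training split only and then applied to the test split, to avoid leaking test statistics into training.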
👉 Open the notebook `spaceship-titanic.ipynb` and click the `open in kaggle` button.
👉 Or, if you want to open it in Google Colab, click the `open in colab` button at the top of the notebook or at the top of this README file.
When opening in Colab, the dataset has to be downloaded and then uploaded to Google Colab manually.
- Found a mistake?
- Improved the accuracy of the model?
- Have a suggestion related to the notebook or the workflow?
- Any other type of contribution will be appreciated.
👉 In the notebook I've provided detailed code and explanations of the concepts. If you like it, please give it a star ⭐️
🧑🏻‍💻 My Profiles: