Welcome to the Exercise Set! This repository contains a collection of programming exercises for practicing data analysis and machine learning. Each exercise is a Jupyter notebook that works through a different dataset and technique.
├── README.md
├── glass-imbalanced.ipynb
├── hotel-reservations.ipynb
└── skillmaster.ipynb
The glass-imbalanced.ipynb notebook analyzes the Glass Imbalanced dataset. It includes:
- Data loading and preprocessing
- Handling missing values
- Exploratory Data Analysis (EDA) using seaborn and matplotlib
- Distribution visualization of refractive index by class
The hotel-reservations.ipynb notebook processes and analyzes hotel booking data. It includes:
- Data preprocessing (handling missing values, encoding categorical variables)
- Exploratory Data Analysis (EDA)
- Logistic Regression for predicting booking cancellations
- Performance evaluation using confusion matrix and ROC curve
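As a rough sketch of the modeling and evaluation steps above, the example below fits a logistic regression and computes a confusion matrix and ROC curve with scikit-learn. It assumes synthetic data from `make_classification` as a stand-in for the already preprocessed booking features; the feature count, class ratio, and split size are illustrative choices, not values from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for preprocessed booking data (assumption):
# label 1 plays the role of "cancelled", label 0 of "not cancelled".
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=4,
    weights=[0.67, 0.33], random_state=42,
)

# Stratify so both splits keep the same cancellation rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42,
)

# Scale the features, then fit the logistic regression.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Confusion matrix from hard predictions (rows: true class, columns: predicted).
y_pred = model.predict(X_test)
cm = confusion_matrix(y_test, y_pred)

# ROC curve and AUC from the predicted probability of the positive class.
y_score = model.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, y_score)
auc = roc_auc_score(y_test, y_score)

print(cm)
print(f"ROC AUC: {auc:.3f}")
```

The confusion matrix depends on the default 0.5 decision threshold, while the ROC curve summarizes performance across all thresholds, which is why the two are usually reported together.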
The skillmaster.ipynb notebook analyzes Udemy course data. It covers:
- Data cleaning and feature engineering
- Encoding categorical variables
- Classification using K-Nearest Neighbors (KNN)
- Model evaluation using accuracy metrics and classification reports
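The encoding, classification, and evaluation steps above might look something like the sketch below. Everything about the data here is hypothetical: the columns `price`, `num_lectures`, and `level`, the derived target, and the choice of `n_neighbors=5` are assumptions made for illustration and are not taken from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic course table; all column names are hypothetical.
rng = np.random.default_rng(7)
n = 600
courses = pd.DataFrame({
    "price": rng.uniform(0, 200, n),
    "num_lectures": rng.integers(5, 300, n),
    "level": rng.choice(["Beginner", "Intermediate", "Expert"], n),
})

# Hypothetical binary target derived from the features plus noise,
# so that the classifier has some signal to learn.
score = (
    courses["num_lectures"] / 300
    + 0.3 * (courses["level"] == "Beginner")
    + rng.normal(0, 0.1, n)
)
target = (score > 0.6).astype(int)

# Encode the categorical column and scale the numeric ones. KNN is
# distance-based, so unscaled features would dominate the metric.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["price", "num_lectures"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["level"]),
])
knn = make_pipeline(preprocess, KNeighborsClassifier(n_neighbors=5))

X_train, X_test, y_train, y_test = train_test_split(
    courses, target, test_size=0.25, stratify=target, random_state=7,
)
knn.fit(X_train, y_train)

# Evaluate with overall accuracy and a per-class report.
y_pred = knn.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred)

print(f"Accuracy: {accuracy:.3f}")
print(report)
```

Wrapping the preprocessing and the classifier in one pipeline keeps the scaler and encoder fitted on the training split only, which avoids leaking test data into the model.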
To run the notebooks, install the required dependencies (the notebook package provides the jupyter notebook command used below):
pip install numpy pandas seaborn matplotlib scikit-learn notebook
- Clone the repository:
git clone https://github.com/salimawad85/git-kaggle.git
- Navigate to the directory:
cd git-kaggle
- Open Jupyter Notebook:
jupyter notebook
- Run the notebooks interactively.
Feel free to contribute by submitting pull requests or reporting issues.
This project is open-source and available under the MIT License.