This repository contains a collection of machine learning projects completed during the Fundamentals of Machine Learning and Advanced Methods in Data Science courses at the German University in Cairo (GUC). The projects cover both clustering and classification domains and leverage real-world and synthetic datasets to demonstrate a wide range of machine learning techniques and evaluation strategies.
Machine_learning/
├── Clustering_projects/
│ ├── Iris_clustering_project.ipynb
│ ├── Moons_clustering_project.ipynb
│ ├── data/
│ └── README.md
│
├── Classification_projects/
│ ├── Detecting Cyber Security Threats using Deep Learning/
│ │ └── notebook.ipynb + data
│ ├── Indian Telecom Customer Churn Prediction/
│ │ └── notebook.ipynb + data
│ ├── Predict the presence of heart disease/
│ │ └── notebook.ipynb + data
│ └── README.md
│
└── README.md
1️⃣ Clustering Projects (Clustering_projects/)
Iris Clustering (Iris_clustering_project.ipynb)
- Objective: Group iris flowers by species using unsupervised learning.
- Techniques: K-Means, Hierarchical Clustering, DBSCAN.
- Evaluation Metrics: Silhouette Score, Inertia.
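The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn's bundled Iris data, standardized features, three clusters, and illustrative DBSCAN parameters (`eps=0.6`, `min_samples=5`) that are not taken from the project.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering, KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Use the features only; the species labels play no part in fitting.
X = StandardScaler().fit_transform(load_iris().data)

# Assumed settings: 3 clusters and illustrative DBSCAN parameters.
models = {
    "K-Means": KMeans(n_clusters=3, n_init=10, random_state=42),
    "Hierarchical": AgglomerativeClustering(n_clusters=3, linkage="ward"),
    "DBSCAN": DBSCAN(eps=0.6, min_samples=5),
}

scores = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    # DBSCAN marks noise points with -1; score only the clustered points.
    mask = labels != -1
    n_clusters = len(set(labels[mask]))
    # The silhouette score is only defined for two or more clusters.
    if n_clusters >= 2:
        scores[name] = silhouette_score(X[mask], labels[mask])
    else:
        scores[name] = float("nan")
    print(f"{name}: {n_clusters} clusters, silhouette = {scores[name]:.3f}")

# Inertia (within-cluster sum of squared distances) exists only for K-Means.
inertia = models["K-Means"].inertia_
print(f"K-Means inertia: {inertia:.1f}")
```

Inertia always decreases as the number of clusters grows, so it is usually read from an elbow plot over several values of `n_clusters` rather than compared as a single number.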
Moons Clustering (Moons_clustering_project.ipynb)
- Objective: Evaluate clustering methods on a synthetic, non-linear dataset.
- Techniques: K-Means, Hierarchical Clustering, DBSCAN.
- Evaluation Metrics: Cluster shape visualizations, Silhouette Score.
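To illustrate why a non-linear dataset is a useful test, here is a small sketch comparing K-Means and DBSCAN on scikit-learn's `make_moons`. The sample size, noise level, and DBSCAN parameters are assumptions chosen for illustration. The adjusted Rand index is not one of the project's listed metrics; it is added here only because the synthetic data comes with known labels.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score, silhouette_score

# Two interleaving half-circles; the parameters are illustrative assumptions.
X, y_true = make_moons(n_samples=400, noise=0.05, random_state=0)

kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
dbscan_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

results = {}
for name, labels in {"K-Means": kmeans_labels, "DBSCAN": dbscan_labels}.items():
    results[name] = {
        # Internal metric: uses only the points and their assigned clusters.
        "silhouette": silhouette_score(X, labels),
        # External check against the known moon membership (not in the project's metric list).
        "ari": adjusted_rand_score(y_true, labels),
    }
    print(name, {k: round(v, 3) for k, v in results[name].items()})
```

The silhouette score tends to reward compact, convex clusters, so it can rank a method higher even when that method splits the moons incorrectly. That is one reason to inspect cluster shape visualizations alongside the score.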
2️⃣ Classification Projects (Classification_projects/)
Detecting Cyber Security Threats using Deep Learning
- Objective: Classify malicious vs. benign network events using PyTorch.
- Techniques: Feedforward Neural Network (FFNN) with Dropout, Early Stopping, Imbalanced Data Handling.
- Evaluation: Accuracy, Precision, Recall, F1-score, Confusion Matrix, ROC Curve, PR Curve.
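A minimal sketch of how these three techniques can fit together in PyTorch is shown below. Everything specific here is an assumption made for illustration: the synthetic data, the layer sizes, the dropout rate, the patience value, and the use of `pos_weight` in the loss as the imbalance handling. The notebook may use different choices.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic stand-in data: 20 features, roughly 10% positive (malicious) events.
n_features = 20
X = torch.randn(1000, n_features)
y = (X[:, 0] + 0.5 * X[:, 1] > 1.4).float()
X_train, y_train, X_val, y_val = X[:800], y[:800], X[800:], y[800:]

# Small feedforward network; the final layer emits a raw logit.
model = nn.Sequential(
    nn.Linear(n_features, 32), nn.ReLU(), nn.Dropout(p=0.3),
    nn.Linear(32, 1),
)

# Imbalance handling (assumed approach): weight positives by the neg/pos ratio.
n_pos = y_train.sum()
pos_weight = ((len(y_train) - n_pos) / n_pos.clamp(min=1)).reshape(1)
loss_fn = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# Early stopping: keep the best weights, stop after `patience` epochs without gain.
best_val, best_state, patience, bad_epochs = float("inf"), None, 5, 0
for epoch in range(100):
    model.train()  # enables dropout
    optimizer.zero_grad()
    loss = loss_fn(model(X_train).squeeze(1), y_train)
    loss.backward()
    optimizer.step()

    model.eval()  # disables dropout for validation
    with torch.no_grad():
        val_loss = loss_fn(model(X_val).squeeze(1), y_val).item()

    if val_loss < best_val - 1e-4:
        best_val, bad_epochs = val_loss, 0
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break

# Restore the weights from the best validation epoch.
model.load_state_dict(best_state)
print(f"stopped after epoch {epoch}, best validation loss {best_val:.4f}")
```

Restoring `best_state` matters because the loop runs for `patience` further epochs after the best one, so the final weights are not necessarily the best ones.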
Indian Telecom Customer Churn Prediction
- Objective: Predict customer churn using usage patterns and demographic data.
- Techniques: Logistic Regression, Random Forest, SVM + SMOTE.
- Evaluation: F1-Score (especially on minority churn class), Precision, Recall, Confusion Matrix.
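The emphasis on minority-class metrics can be sketched as follows. This is an assumption-laden illustration: the data is synthetic, and `class_weight="balanced"` is used as a stand-in for SMOTE so that the sketch depends on scikit-learn only. With SMOTE (from the separate imbalanced-learn package), the oversampling would be applied to the training split only, never to the test split.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the churn data: about 15% positive (churn) class.
X, y = make_classification(n_samples=2000, n_features=10, n_informative=5,
                           weights=[0.85, 0.15], random_state=42)

# Stratify so both splits keep the same churn rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

# Stand-in for SMOTE: reweight classes instead of oversampling the minority.
clf = LogisticRegression(class_weight="balanced", max_iter=1000)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# pos_label=1 restricts each metric to the minority (churn) class.
churn_precision = precision_score(y_test, y_pred, pos_label=1)
churn_recall = recall_score(y_test, y_pred, pos_label=1)
churn_f1 = f1_score(y_test, y_pred, pos_label=1)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
print(f"precision={churn_precision:.3f} recall={churn_recall:.3f} f1={churn_f1:.3f}")
print(cm)
```

Accuracy is left out on purpose: with roughly 85% non-churners, a model that never predicts churn would already score about 85% accuracy while having zero recall on the class of interest.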
Predict the presence of heart disease
- Objective: Predict the presence of heart disease based on clinical parameters.
- Techniques: K-Nearest Neighbors, Logistic Regression, Decision Trees.
- Evaluation: Accuracy, Precision, Recall, F1-score, Confusion Matrix, ROC Curve.
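A side-by-side comparison of the three listed models might look like the sketch below. The dataset is a synthetic stand-in, and the hyperparameters (`n_neighbors=7`, `max_depth=4`) are illustrative assumptions rather than the notebook's settings.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the clinical data.
X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# KNN and logistic regression are scale-sensitive, so they get a scaler;
# decision trees split on thresholds and do not need one.
models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=7)),
    "Logistic Regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # ROC AUC needs scores, so use the positive-class probability.
    proba = model.predict_proba(X_test)[:, 1]
    results[name] = {
        "accuracy": accuracy_score(y_test, model.predict(X_test)),
        "roc_auc": roc_auc_score(y_test, proba),
    }
    print(name, {k: round(v, 3) for k, v in results[name].items()})
```

Putting the scaler inside the pipeline means it is fitted on the training split only, which avoids leaking test-set statistics into the model.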
- Language: Python
- Tools & Libraries: pandas, numpy, matplotlib, seaborn, scikit-learn, PyTorch (for the DL project)
- Environment: Jupyter Notebook
- Clone the repository:
    git clone https://github.com/Ajeeb-Alameen/Machine_learning.git
    cd Machine_learning
- Navigate to a project and open the notebook:
    cd Clustering_projects  # or Classification_projects
    jupyter notebook <notebook_name>.ipynb
Each notebook typically includes:
- Exploratory Data Analysis (EDA)
- Model Building & Evaluation
- Performance Metrics & Visualizations
- Final Discussion & Insights
Possible future improvements:
- Apply hyperparameter tuning and cross-validation
- Explore ensemble and boosting methods (e.g., XGBoost, LightGBM)
- Expand datasets and incorporate feature selection
- Improve generalization using regularization techniques
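As one possible starting point for the tuning and regularization items above, the sketch below runs a cross-validated grid search over the regularization strength of a logistic regression. The synthetic data, the grid values, and the choice of F1 as the scoring metric are all assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data.
X, y = make_classification(n_samples=500, n_features=12, n_informative=5,
                           random_state=1)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Smaller C means stronger L2 regularization; the grid is illustrative.
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}

# Stratified folds keep the class ratio stable across splits.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
search = GridSearchCV(pipe, param_grid, scoring="f1", cv=cv)
search.fit(X, y)

print("best C:", search.best_params_["clf__C"])
print("mean cross-validated F1:", round(search.best_score_, 3))
```

Because the scaler sits inside the pipeline, it is refitted on each training fold, so the cross-validated score is not inflated by information from the held-out fold.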
- 📂 GitHub Repository: https://github.com/Ajeeb-Alameen/Machine_learning
- ✍ Author: Ajeeb Alameen
- 📧 Email: ajeebizzalameen@gmail.com
- 🔗 LinkedIn: Ajeeb Alameen