Ajeeb-Alameen/machine-learning

📚 Machine Learning Projects

📌 Overview

This repository contains machine learning projects completed during the Fundamentals of Machine Learning and Advanced Methods in Data Science courses at the German University in Cairo (GUC). The projects cover clustering and classification, and use both real-world and synthetic datasets to demonstrate a range of machine learning techniques and evaluation strategies.


📁 Repository Structure

Machine_learning/
├── Clustering_projects/
│   ├── Iris_clustering_project.ipynb
│   ├── Moons_clustering_project.ipynb
│   ├── data/
│   └── README.md
│
├── Classification_projects/
│   ├── Detecting Cyber Security Threats using Deep Learning/
│   │   └── notebook.ipynb + data
│   ├── Indian Telecom Customer Churn Prediction/
│   │   └── notebook.ipynb + data
│   ├── Predict the presence of heart disease/
│   │   └── notebook.ipynb + data
│   └── README.md
│
└── README.md

🔍 Project Summaries

1️⃣ Clustering Projects

🌸 Iris Clustering

  • Objective: Group iris samples by species using unsupervised learning.
  • Techniques: K-Means, Hierarchical Clustering, DBSCAN.
  • Evaluation Metrics: Silhouette Score, Inertia.
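
The comparison described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn's bundled iris data, not the notebook's actual code; the feature scaling, the number of clusters, and the DBSCAN `eps`/`min_samples` values are all assumptions, and `safe_silhouette` is a hypothetical helper.

```python
from sklearn.cluster import AgglomerativeClustering, DBSCAN, KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Load the four iris measurements; species labels are deliberately ignored.
X = StandardScaler().fit_transform(load_iris().data)

# Hyperparameters below are illustrative assumptions, not the notebook's values.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
models = {
    "K-Means": kmeans.labels_,
    "Hierarchical": AgglomerativeClustering(n_clusters=3).fit_predict(X),
    "DBSCAN": DBSCAN(eps=0.8, min_samples=5).fit_predict(X),
}


def safe_silhouette(X, labels):
    """Silhouette over non-noise points; None if fewer than 2 clusters remain."""
    mask = labels != -1  # DBSCAN marks noise points with -1
    if len(set(labels[mask].tolist())) < 2:
        return None
    return silhouette_score(X[mask], labels[mask])


scores = {name: safe_silhouette(X, labels) for name, labels in models.items()}
for name, score in scores.items():
    print(name, "silhouette:", score)

# Inertia (within-cluster sum of squares) is only defined for K-Means.
print("K-Means inertia:", kmeans.inertia_)
```

Inertia is reported only for K-Means because the other two algorithms do not optimize or expose it, so the silhouette score is the metric that can be compared across all three.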

🌙 Moons Dataset Clustering

  • Objective: Evaluate clustering methods on a synthetic, non-linear dataset.
  • Techniques: K-Means, Hierarchical Clustering, DBSCAN.
  • Evaluation Metrics: Cluster shape visualizations, Silhouette Score.
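
A small sketch of why a non-linear dataset separates the methods: K-Means assumes roughly convex clusters, while DBSCAN follows density and can trace each crescent. The sample size, noise level, and DBSCAN parameters here are assumed values, and the adjusted Rand index is an addition for illustration (it is possible only because `make_moons` returns the generating labels); the project itself lists silhouette score and visualizations.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Synthetic two-moons data; sample size and noise level are assumed values.
X, y_true = make_moons(n_samples=300, noise=0.05, random_state=0)

kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
dbscan_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

# Count DBSCAN clusters, excluding the noise label -1.
n_dbscan_clusters = len(set(dbscan_labels.tolist()) - {-1})

# Agreement with the generating moon (1.0 means a perfect match).
kmeans_ari = adjusted_rand_score(y_true, kmeans_labels)
dbscan_ari = adjusted_rand_score(y_true, dbscan_labels)
print("DBSCAN clusters:", n_dbscan_clusters)
print("ARI  K-Means:", kmeans_ari, " DBSCAN:", dbscan_ari)
```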

2️⃣ Classification Projects

🛡️ Detecting Cyber Security Threats using Deep Learning

  • Objective: Classify malicious vs. benign network events using PyTorch.
  • Techniques: Feedforward Neural Network (FFNN) with Dropout, Early Stopping, Imbalanced Data Handling.
  • Evaluation: Accuracy, Precision, Recall, F1-score, Confusion Matrix, ROC Curve, PR Curve.
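
Early stopping, one of the techniques listed above, can be sketched independently of the framework. The `EarlyStopping` class below is a hypothetical helper, not the notebook's code, and the validation losses are made-up numbers standing in for a real PyTorch training loop.

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait after the last improvement
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best_loss = float("inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses standing in for a real training loop.
val_losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.50]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(epoch, loss):
        stopped_at = epoch
        break

print("stopped at epoch", stopped_at, "best epoch", stopper.best_epoch)
```

In a real loop, the model weights from `best_epoch` would typically be saved on each improvement and restored after stopping.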

📞 Indian Telecom Customer Churn Prediction

  • Objective: Predict customer churn using usage patterns and demographic data.
  • Techniques: Logistic Regression, Random Forest, SVM + SMOTE.
  • Evaluation: F1-Score (especially on minority churn class), Precision, Recall, Confusion Matrix.
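
To illustrate the idea behind SMOTE, the sketch below generates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors. It is a simplified NumPy approximation, not the imbalanced-learn implementation and not the notebook's code; `smote_like_oversample` and its parameters are hypothetical, and it assumes at least two minority rows.

```python
import numpy as np


def smote_like_oversample(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by interpolating between neighbors."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    k = min(k, len(X_min) - 1)  # cannot have more neighbors than other points

    # Pairwise Euclidean distances among minority samples.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]

    base = rng.integers(0, len(X_min), size=n_new)            # random base points
    picked = neighbors[base, rng.integers(0, k, size=n_new)]  # one neighbor each
    lam = rng.random((n_new, 1))                               # position on the segment
    return X_min[base] + lam * (X_min[picked] - X_min[base])


# Toy minority class: 20 rows, 3 features (random data for illustration only).
X_minority = np.random.default_rng(1).normal(size=(20, 3))
X_synthetic = smote_like_oversample(X_minority, n_new=50)
print(X_synthetic.shape)
```

Oversampling of this kind should be applied to the training split only, so that the test set still reflects the original class balance.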

❤️ Predicting Heart Disease

  • Objective: Predict the presence of heart disease based on clinical parameters.
  • Techniques: K-Nearest Neighbors, Logistic Regression, Decision Trees.
  • Evaluation: Accuracy, Precision, Recall, F1-score, Confusion Matrix, ROC Curve.
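
The threshold-based metrics listed above all derive from the four confusion-matrix counts. The following dependency-free sketch shows the arithmetic; `binary_metrics` is a hypothetical helper and the labels are toy values, not results from the heart disease notebook.

```python
def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix counts and derived metrics for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "confusion": {"tp": tp, "tn": tn, "fp": fp, "fn": fn},
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels: 4 positives and 6 negatives, with one error of each kind.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```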

📄 For more detailed project-level documentation, see the README in each subfolder (Clustering_projects/README.md and Classification_projects/README.md).


🛠 Technologies Used

  • Language: Python
  • Tools & Libraries:
    • pandas, numpy, matplotlib, seaborn
    • scikit-learn, PyTorch (for DL project)
  • Environment: Jupyter Notebook

🚀 How to Run the Projects

  1. Clone the repository

    git clone https://github.com/Ajeeb-Alameen/Machine_learning.git
    cd Machine_learning
  2. Navigate to a project and open the notebook

    cd Clustering_projects  # or Classification_projects
    jupyter notebook <notebook_name>.ipynb

📈 Project Flow

Each notebook typically includes:

  • Exploratory Data Analysis (EDA)
  • Model Building & Evaluation
  • Performance Metrics & Visualizations
  • Final Discussion & Insights

✅ Next Steps

  • Apply hyperparameter tuning and cross-validation
  • Explore ensemble and boosting methods (e.g., XGBoost, LightGBM)
  • Expand datasets and incorporate feature selection
  • Improve generalization using regularization techniques
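
As a starting point for the first item, hyperparameter tuning with cross-validation might look like the sketch below. It uses scikit-learn's `GridSearchCV` with a K-Nearest Neighbors pipeline on the bundled iris data purely for illustration; the dataset, the parameter grid, the fold count, and the scoring metric are all assumptions rather than choices made in this repository.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Scaling lives inside the pipeline so each CV fold is scaled on its own
# training portion, which avoids leaking validation statistics.
pipeline = make_pipeline(StandardScaler(), KNeighborsClassifier())

# make_pipeline names steps after the lowercased class name.
neighbor_options = [3, 5, 7, 9]
param_grid = {"kneighborsclassifier__n_neighbors": neighbor_options}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="f1_macro")
search.fit(X, y)

print("best params:", search.best_params_)
print("best mean CV f1_macro:", search.best_score_)
```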
