Majdi21926/Unsupervised-ML-Athlete-Recommender-Analytics-for-Paris-2024

Unsupervised ML: Athlete Recommender & Analytics for Paris 2024

📌 Project Overview

This project applies unsupervised machine learning to analyze and recommend athletes for the Paris 2024 Olympic & Paralympic Games. Using clustering and similarity-based methods, we built an athlete recommender system that identifies similar athletes based on key performance, demographic, and social influence attributes.

🚀 Features

  • Athlete Similarity Recommender: Finds and suggests athletes with similar profiles.
  • Data Preprocessing & Normalization: Encoding, scaling, and handling categorical attributes.
  • Chatbot Integration: Users can query the system using text inputs (e.g., "Football male athletes for JO2024 with Instagram influence >10K").
  • Exploratory Data Analysis (EDA): Understanding athlete distribution based on multiple attributes.
  • Visualization: Graphical representations of athlete clusters and similarities.

📊 Dataset

The dataset contains 600 athletes with 82 features, including:

  • Demographics: Age, Gender
  • Performance Metrics: Medals (Gold, Silver, Bronze)
  • Competition Status: Participation in Paris 2024 (Qualified, Non-Eligible, etc.)
  • Sport & Para-sport Category: Encoded categorical variables
  • Influence Metrics: Social media following (normalized values)

🛠 Methodology

1️⃣ Data Preprocessing & Encoding

  • Encoded categorical variables (e.g., sport, gender, status) using one-hot encoding.
  • Normalized numerical features using MinMaxScaler (Age, Followers, etc.).
  • Kept only relevant features with no missing values for scoring and similarity analysis.

2️⃣ Feature Engineering & Similarity Computation

  • Built a feature matrix (X) combining sports, performance, and influence data.
  • Used cosine similarity to measure athlete likeness.
  • Identified the top 5 most similar athletes for any given athlete.
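
As a rough sketch of the similarity step, the snippet below computes cosine similarity with NumPy and returns the closest athletes to a query. The feature values, the placeholder names, and the `top_k_similar` helper are assumptions for illustration; the project's real feature matrix has many more columns.

```python
import numpy as np


def top_k_similar(X, names, query, k=5):
    """Return the k (name, score) pairs most cosine-similar to `query`."""
    X = np.asarray(X, dtype=float)

    # Normalize each row to unit length so a dot product equals cosine similarity.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero rows
    unit = X / norms

    idx = names.index(query)
    sims = unit @ unit[idx]
    sims[idx] = -np.inf  # never recommend the athlete to themselves

    k = min(k, len(names) - 1)
    order = np.argsort(-sims)[:k]
    return [(names[i], float(sims[i])) for i in order]


# Hypothetical feature matrix: [sport_sailing, sport_judo, followers_scaled]
names = ["A", "B", "C", "D"]
X = [
    [1.0, 0.0, 0.9],
    [1.0, 0.0, 0.8],
    [0.0, 1.0, 0.1],
    [0.0, 1.0, 0.2],
]

result = top_k_similar(X, names, "A", k=2)
print(result)
```

Athlete "B" ranks first here because it shares the same sport column and a similar influence value with "A". Cosine similarity compares the direction of the feature vectors rather than their magnitude.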

3️⃣ Chatbot for Athlete Search & Recommendations

  • Developed a query-based chatbot that filters athletes based on textual queries.
  • Users can search for athletes using conditions like sport type, gender, status, and social influence.
  • Returns a ranked list of matching athletes with their details.
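
One simple way to implement this kind of query filtering is keyword and regex matching, sketched below. The athlete records, field names, and the choice to rank results by follower count are assumptions for illustration. The source does not specify the ranking criterion, and this sketch ignores the status condition. It also uses raw follower counts for readability, whereas the dataset stores normalized values.

```python
import re

# Hypothetical records; the real dataset has many more fields.
ATHLETES = [
    {"name": "Athlete A", "sport": "Football", "gender": "male", "followers": 52_000},
    {"name": "Athlete B", "sport": "Football", "gender": "female", "followers": 120_000},
    {"name": "Athlete C", "sport": "Football", "gender": "male", "followers": 8_000},
    {"name": "Athlete D", "sport": "Judo", "gender": "male", "followers": 300_000},
]

MULTIPLIERS = {"": 1, "k": 1_000, "m": 1_000_000}


def parse_query(text, known_sports):
    """Extract sport, gender, and a follower threshold from free text."""
    text = text.lower()
    filters = {}
    for sport in known_sports:
        if re.search(rf"\b{re.escape(sport.lower())}\b", text):
            filters["sport"] = sport
    # Word boundaries keep "male" from matching inside "female".
    gender = re.search(r"\b(male|female)\b", text)
    if gender:
        filters["gender"] = gender.group(1)
    # Matches thresholds such as ">10K", "> 2.5m", or ">500".
    threshold = re.search(r">\s*(\d+(?:\.\d+)?)([km]?)\b", text)
    if threshold:
        value, suffix = threshold.groups()
        filters["min_followers"] = float(value) * MULTIPLIERS[suffix]
    return filters


def search(athletes, text):
    """Return matching athletes, ranked by follower count (an assumed ranking)."""
    filters = parse_query(text, sorted({a["sport"] for a in athletes}))
    hits = [
        a for a in athletes
        if filters.get("sport", a["sport"]) == a["sport"]
        and filters.get("gender", a["gender"]) == a["gender"]
        and a["followers"] > filters.get("min_followers", -1)
    ]
    return sorted(hits, key=lambda a: a["followers"], reverse=True)


query = "Football male athletes for JO2024 with Instagram influence >10K"
matches = search(ATHLETES, query)
print([a["name"] for a in matches])
```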

📈 Visualization & Analysis

  • Heatmaps to illustrate athlete similarity scores.
  • Clustering of athletes based on performance and social influence.
  • Graphical representation of top athlete recommendations.
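
The README does not name the clustering algorithm, so the sketch below uses k-means from scikit-learn purely as an assumed stand-in. The two feature columns (`medal_score` and `followers_scaled`) and their values are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical features per athlete: [medal_score, followers_scaled], both in [0, 1].
features = np.array(
    [
        [0.90, 0.80],
        [0.85, 0.90],
        [0.95, 0.85],
        [0.10, 0.20],
        [0.15, 0.10],
        [0.05, 0.15],
    ]
)

# k-means is an assumption here; the number of clusters is chosen by hand.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(features)

for row, label in zip(features, labels):
    print(f"medals={row[0]:.2f} followers={row[1]:.2f} -> cluster {label}")
```

The resulting labels can then be used to color a scatter plot of athletes. Which cluster gets which integer id is arbitrary.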

🏆 Example: Finding Similar Athletes

Given Hélène Noesmoen as the query athlete, the system recommends the following five most similar athletes:

  1. Axel Mazella
  2. Louise Cervera
  3. Lou Berthomieu
  4. Jean-Baptiste Bernaz
  5. Charline Picon

These recommendations are based on shared attributes like sports category, performance, and social influence.

📌 Future Enhancements

  • Integrate deep learning for enhanced similarity detection.
  • Expand the chatbot with natural language processing (NLP) for better query understanding.
  • Add real-time athlete data updates.

Made with ❤️ for Paris 2024 Data Analysis & Recommender Systems 🏅
