This project is a Streamlit web application designed to analyze historical Olympic Games data from Athens 1896 to Rio 2016. The dataset includes details about athletes, events, and medals, offering insights into how the Olympic Games have evolved over time. The web app is interactive and allows users to explore different aspects of the data, including medal tallies, country-wise performances, and athlete achievements.
- Medal Tally: View the medal tally for specific years and countries.
- Overall Analysis: Examine key statistics such as the number of editions, participating nations, athletes, events, and more.
- Country-wise Analysis: Explore how specific countries have performed in terms of medals over the years.
- Athlete-wise Analysis: View the top-performing athletes by sport and medals won.
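As an illustration of the Medal Tally feature, here is a minimal pandas sketch. It is not the code from `helper.py`: the function name `medal_tally`, the `MEDAL_KEY` list, and the sample rows (with made-up NOC codes) are hypothetical. Only the column names come from the dataset described in this README. The sketch assumes that a team medal appears once per team member in the data, so duplicates are dropped before counting.

```python
import pandas as pd

# Columns that together identify one medal-winning result (assumption based on
# the dataset schema). Dropping duplicates on them counts a team medal once
# instead of once per team member.
MEDAL_KEY = ["Team", "NOC", "Games", "Year", "City", "Sport", "Event", "Medal"]


def medal_tally(df: pd.DataFrame, year=None, noc=None) -> pd.DataFrame:
    """Return Gold/Silver/Bronze counts per NOC, optionally filtered."""
    # Rows without a medal are NaN after loading, so drop them first.
    medals = df.dropna(subset=["Medal"]).drop_duplicates(subset=MEDAL_KEY)
    if year is not None:
        medals = medals[medals["Year"] == year]
    if noc is not None:
        medals = medals[medals["NOC"] == noc]
    tally = (
        medals.groupby(["NOC", "Medal"]).size().unstack(fill_value=0)
        .reindex(columns=["Gold", "Silver", "Bronze"], fill_value=0)
    )
    tally["Total"] = tally.sum(axis=1)
    return tally.sort_values(["Gold", "Silver", "Bronze"], ascending=False)


# Made-up rows: two members of the same relay team, plus individual results.
sample = pd.DataFrame({
    "Name": ["Athlete A", "Athlete B", "Athlete C", "Athlete D", "Athlete E"],
    "Team": ["Team A", "Team A", "Team B", "Team C", "Team C"],
    "NOC": ["AAA", "AAA", "BBB", "CCC", "CCC"],
    "Games": ["2016 Summer"] * 5,
    "Year": [2016] * 5,
    "City": ["Rio de Janeiro"] * 5,
    "Sport": ["Athletics"] * 5,
    "Event": ["Relay", "Relay", "Sprint", "Marathon", "Sprint"],
    "Medal": ["Gold", "Gold", "Gold", "Silver", None],
})

tally = medal_tally(sample, year=2016)
print(tally)
```

Passing `noc="AAA"` restricts the same tally to a single country, which mirrors the year and country filters described above.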
The project is built with the following tools:
- Frontend: Streamlit for the user interface.
- Backend: Python, Pandas for data processing.
- Data Visualization: Seaborn and Plotly for generating insightful visualizations.
- Deployment: The application is deployed on Heroku for easy access.
The dataset used in this project is historical data on the Olympic Games, scraped from Sports Reference in May 2018. It includes 271,116 rows and 15 columns with details about athletes, their performance, and events they participated in. The columns are:
- `ID`: Unique number for each athlete
- `Name`: Athlete's name
- `Sex`: M or F
- `Age`: Integer
- `Height`: In centimeters
- `Weight`: In kilograms
- `Team`: Team name
- `NOC`: National Olympic Committee 3-letter code
- `Games`: Year and season
- `Year`: Integer
- `Season`: Summer or Winter
- `City`: Host city
- `Sport`: Sport
- `Event`: Event
- `Medal`: Gold, Silver, Bronze, or NA
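The column list above can be turned into a small loading check. The following is a sketch rather than the project's actual loader: `load_events` and `EXPECTED_COLUMNS` are hypothetical names, and the two CSV rows are invented for the demonstration. It relies on the default behavior of `pandas.read_csv`, which parses the literal string `NA` as a missing value, so athletes without a medal end up with `NaN` in `Medal`.

```python
import io

import pandas as pd

# The 15 columns described in this README.
EXPECTED_COLUMNS = [
    "ID", "Name", "Sex", "Age", "Height", "Weight", "Team", "NOC",
    "Games", "Year", "Season", "City", "Sport", "Event", "Medal",
]


def load_events(source) -> pd.DataFrame:
    """Read the athlete data and fail early if a column is missing."""
    # read_csv treats "NA" as missing by default, so Medal becomes NaN
    # for athletes who did not win a medal.
    df = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df


# Invented rows in the same layout, used instead of the real file.
demo_csv = io.StringIO(
    "ID,Name,Sex,Age,Height,Weight,Team,NOC,Games,Year,Season,City,Sport,Event,Medal\n"
    "1,Athlete A,M,24,180,80,Team A,AAA,2016 Summer,2016,Summer,Rio de Janeiro,Athletics,Example Event,NA\n"
    "2,Athlete B,F,NA,NA,NA,Team B,BBB,2016 Summer,2016,Summer,Rio de Janeiro,Athletics,Example Event,Gold\n"
)
df = load_events(demo_csv)
print(df[["Name", "Age", "Medal"]])
```

With the real data, the call would be `load_events("athlete_events.csv")`. Note that a numeric column containing missing values is parsed by pandas as floating point, even if every present value is a whole number.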
To run the project locally, make sure the prerequisites below are installed, then follow the steps listed after them:
- Python 3.x
- Streamlit
- Pandas
- Plotly
- Seaborn
- Matplotlib
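One way to confirm that the prerequisites above are available is a short check with the standard library. This is an optional sketch, not part of the project: `missing_packages` and `REQUIRED` are hypothetical names, and the import names are assumed to match the package names in the list.

```python
from importlib.util import find_spec

# Import names assumed to match the packages listed above.
REQUIRED = ["streamlit", "pandas", "plotly", "seaborn", "matplotlib"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Any package reported as missing can then be installed before running the app.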
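- Clone the repository and change into its directory.
- Install the dependencies, for example with `pip install streamlit pandas plotly seaborn matplotlib`.
- Run the application with `streamlit run app.py`.
- Open the local URL that Streamlit prints in the terminal (by default `http://localhost:8501`).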
The project consists of the following files:
- `app.py`: Main Streamlit app
- `helper.py`: Helper functions for data processing and visualization
- `preprocessor.py`: Data cleaning and preprocessing
- `athlete_events.csv`: Dataset with Olympic athlete data
- `noc_regions.csv`: National Olympic Committee regions
- `README.md`: This file
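To show how the two CSV files relate, here is a hedged sketch of one plausible preprocessing step: joining region names onto the athlete rows by NOC code. It is not taken from `preprocessor.py`. The function name `attach_regions` is hypothetical, and the sketch assumes that `noc_regions.csv` has `NOC` and `region` columns. The tiny frames below use made-up codes in place of the real files.

```python
import pandas as pd


def attach_regions(events: pd.DataFrame, regions: pd.DataFrame) -> pd.DataFrame:
    """Left-join region names onto athlete rows by NOC code."""
    # how="left" keeps every athlete row even if its NOC has no region entry.
    return events.merge(regions[["NOC", "region"]], on="NOC", how="left")


# Made-up stand-ins for athlete_events.csv and noc_regions.csv.
events = pd.DataFrame({"Name": ["Athlete A", "Athlete B"], "NOC": ["AAA", "ZZZ"]})
regions = pd.DataFrame({"NOC": ["AAA"], "region": ["Region A"]})

merged = attach_regions(events, regions)
print(merged)
```

With the real files, the inputs would come from `pd.read_csv("athlete_events.csv")` and `pd.read_csv("noc_regions.csv")`. A left join is used here so that rows with an unmatched NOC code are kept with a missing region rather than silently dropped.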
- Data scraped from Sports Reference.
- Thanks to the Olympic history enthusiasts and 'statistorians' for their incredible research.
This project is licensed under the MIT License.