This is the historical Olympic data analysis project for MSDS 7331. The dataset was taken from Kaggle. It consists of 271k rows and 15 columns of Olympic data spanning the years 1896 to 2016.
The environment.yml file holds all the dependencies for this project.
Because conda is used to manage the Python environment, you will need to set up Anaconda as described below to get the project working on your machine.
If Anaconda is not set up on your machine, you can follow the directions here.
If you are using a Mac, directions for setup can be found here.
Creating the olympics environment
# cd into the olympics directory
cd <your_local_path>/olympics/
# in the terminal, run
conda env create -f environment.yml
This creates the olympics conda environment with all the dependencies listed in environment.yml. You can now switch to the environment by running the following (on conda 4.4 or later, conda activate olympics is the equivalent command):
source activate olympics
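To confirm that the switch worked, one option is to check the CONDA_DEFAULT_ENV variable, which conda sets when an environment is activated. The following is a minimal sketch, not part of the project itself; the helper names `active_conda_env` and `is_olympics_env_active` are hypothetical.

```python
import os


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if none is set."""
    env = os.environ if environ is None else environ
    # conda exports CONDA_DEFAULT_ENV when an environment is activated
    return env.get("CONDA_DEFAULT_ENV")


def is_olympics_env_active(environ=None, expected="olympics"):
    """Check whether the active environment matches the expected name."""
    return active_conda_env(environ) == expected


if __name__ == "__main__":
    if is_olympics_env_active():
        print("olympics environment is active")
    else:
        print(f"active environment is {active_conda_env()!r}; activate olympics first")
```

Running this with the olympics environment's Python should report that the environment is active; any other result suggests the activation step was skipped.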
If you add a new dependency to the project, make sure you update the environment.yml file. To have the updated dependencies reflected in your local environment, run:
conda env update -f environment.yml
The olympics environment includes Jupyter Notebook as well as many of the other packages we need to run the analysis for this project.
Our group has developed a proof of concept that includes an Olympic sport recommendation engine. The system takes in a few factors from the user and suggests the Olympic sport that might suit them best. Our model's prediction is based on 120 years' worth of Olympic athlete data (1896 to 2016). See which sport best fits you!
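This README does not describe how the recommendation model works, so the following is only a toy illustration of the general idea, not the project's method. It assumes the user factors are height, weight, and age, and it recommends the sport whose average athlete is closest to the user (a nearest-centroid rule). The function names and the sample rows are hypothetical; the rows are made up for illustration and are not taken from the dataset.

```python
import math
from collections import defaultdict


def sport_centroids(rows):
    """Average (height_cm, weight_kg, age) per sport from (sport, h, w, a) rows."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for sport, height, weight, age in rows:
        acc = sums[sport]
        acc[0] += height
        acc[1] += weight
        acc[2] += age
        acc[3] += 1
    return {s: (a[0] / a[3], a[1] / a[3], a[2] / a[3]) for s, a in sums.items()}


def recommend_sport(centroids, height, weight, age):
    """Return the sport whose centroid is nearest to the user (Euclidean distance)."""
    user = (height, weight, age)
    return min(centroids, key=lambda s: math.dist(centroids[s], user))


# Made-up illustrative rows, NOT real athlete records
SAMPLE_ROWS = [
    ("Gymnastics", 160, 52, 20),
    ("Gymnastics", 164, 56, 22),
    ("Basketball", 200, 98, 26),
    ("Basketball", 204, 102, 28),
]

if __name__ == "__main__":
    centroids = sport_centroids(SAMPLE_ROWS)
    print(recommend_sport(centroids, height=165, weight=58, age=23))
```

The sketch leaves the features unscaled for brevity; a real model would normally standardize them so that no single factor dominates the distance.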
| Brian Coari | Stephen Merritt | Cory Thigpen | Quentin Thomas |