The IPL Win Predictor is a machine learning model that predicts the probability of a team winning an Indian Premier League (IPL) match based on current match statistics. This project includes a logistic regression model implemented in Python and an interactive web application built with Streamlit, as well as containerized deployment on AWS ECR and SageMaker.
- Overview
- Installation
- Deployment
- Usage
- Files
- Model Details
- Streamlit Application
- Results
- Requirements
- License
The IPL Win Predictor leverages historical match data to train a logistic regression model that provides real-time win probabilities based on the current state of the game. The model can be accessed through a Streamlit web app, allowing users to interactively input match details and receive win predictions.
To run this project locally, follow these steps:
- Clone the repository: `git clone https://github.com/shivangsingh26/IPL-WIN-PREDICTOR.git`, then `cd IPL-WIN-PREDICTOR`
- Install the required packages: `pip install -r requirements.txt`
- Download the dataset:
  - Place `matches.csv` and `deliveries.csv` in the data directory.
  - The dataset can be found here.
- Run the app locally, or run the Docker container from AWS ECR: `streamlit run app.py`
This project is deployed using Docker containers on AWS ECR and AWS SageMaker.
- Build Docker Image: `docker build -t registry_ipl_win_pred .`
- Push Image to AWS ECR: Make sure your ECR repository is set up (e.g., registry_ipl_win_pred), then tag and push your Docker image with `docker tag registry_ipl_win_pred:latest 992382843941.dkr.ecr.us-east-1.amazonaws.com/registry_ipl_win_pred:latest` followed by `docker push 992382843941.dkr.ecr.us-east-1.amazonaws.com/registry_ipl_win_pred:latest`
- Create a SageMaker Notebook Instance and open a terminal.
- Execute Deployment Code in the notebook to pull the Docker image from ECR and deploy it using a SageMaker model and endpoint.
- Access Endpoint: Once the deployment is successful, the endpoint can be accessed for predictions.
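The image reference used in the tag and push steps follows the standard ECR pattern `<account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>`. As a minimal sketch, the helpers below assemble that URI and the two Docker commands for a given account. The function names `ecr_image_uri` and `docker_push_commands` are hypothetical and are not part of this repository.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build an ECR image URI from its parts (hypothetical helper)."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def docker_push_commands(account_id, region, repository, tag="latest"):
    """Return the docker tag and docker push commands as strings."""
    uri = ecr_image_uri(account_id, region, repository, tag)
    return [
        f"docker tag {repository}:{tag} {uri}",
        f"docker push {uri}",
    ]


# Values taken from the deployment steps above.
commands = docker_push_commands("992382843941", "us-east-1", "registry_ipl_win_pred")
for command in commands:
    print(command)
```

Substituting a different account ID or region yields the commands for another ECR registry.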
To use the web application:
- Select the `batting` and `bowling` teams.
- Select the `host` city.
- Enter the `target` score.
- Enter the `current score`, `overs completed`, and `wickets out`.
- Click on the `Predict Probability` button to get the win probabilities for both teams.
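To illustrate how the values entered above could map onto model inputs, here is a minimal sketch in plain Python. It assumes a 20-over (120-ball) innings, whole overs only, and that `wickets` means wickets in hand. The function name `match_state_features` and the `crr` and `rrr` rate features are assumptions for illustration; the actual `app.py` may compute things differently.

```python
def match_state_features(target, current_score, overs_completed, wickets_out):
    """Derive chase features from the values a user enters (illustrative only)."""
    # Assumption: whole overs only; partial overs such as 10.3 are not handled.
    balls_bowled = overs_completed * 6
    runs_left = target - current_score
    balls_left = 120 - balls_bowled  # assumes a 20-over innings
    wickets = 10 - wickets_out       # assumed to mean wickets in hand
    # Current and required run rates in runs per over (assumed extra features).
    crr = current_score * 6 / balls_bowled if balls_bowled else 0.0
    rrr = runs_left * 6 / balls_left if balls_left else 0.0
    return {
        "runs_left": runs_left,
        "balls_left": balls_left,
        "wickets": wickets,
        "crr": crr,
        "rrr": rrr,
    }


features = match_state_features(target=180, current_score=90,
                                overs_completed=10, wickets_out=3)
print(features)
```

The guards on `balls_bowled` and `balls_left` avoid a division by zero at the very start or very end of the innings.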
- `notebook.ipynb`: Jupyter notebook containing the data analysis and model training code.
- `app.py`: Streamlit app script for the interactive web interface.
- `data/matches.csv`: Historical match data.
- `data/deliveries.csv`: Ball-by-ball delivery data.
The model training includes:
- Loading and merging `match` and `delivery` data.
- Feature engineering to create useful features like `current_score`, `runs_left`, `balls_left`, `wickets`, etc.
- Using `logistic regression` to predict the win probability.
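The merge and feature engineering steps listed above can be sketched with pandas on a tiny hand-made frame. The column names `match_id`, `total_runs`, `is_wicket`, and `target` are assumptions for illustration and may not match the real CSV files. The sketch also treats every row as a legal ball, ignores extras, and reads `wickets` as wickets in hand.

```python
import pandas as pd

# Hypothetical ball-by-ball rows for the second innings of one match.
deliveries = pd.DataFrame({
    "match_id": [1, 1, 1, 1],
    "total_runs": [4, 0, 1, 6],
    "is_wicket": [0, 1, 0, 0],
})
# Hypothetical match-level table carrying the chase target.
matches = pd.DataFrame({"match_id": [1], "target": [150]})

# Attach the target to every delivery of the match.
df = deliveries.merge(matches, on="match_id")

# Running totals per match give the state of the chase after each ball.
df["ball_number"] = df.groupby("match_id").cumcount() + 1
df["current_score"] = df.groupby("match_id")["total_runs"].cumsum()
df["runs_left"] = df["target"] - df["current_score"]
df["balls_left"] = 120 - df["ball_number"]  # assumes 20 overs, no extras
df["wickets"] = 10 - df.groupby("match_id")["is_wicket"].cumsum()

print(df[["current_score", "runs_left", "balls_left", "wickets"]])
```

Grouping by `match_id` keeps the running totals from leaking across matches once more than one match is loaded.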
The main steps in `notebook.ipynb` include:
- Loading and preprocessing data.
- Feature engineering.
- Model training and evaluation.
- Saving the trained model using pickle.
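The training, evaluation, and pickling steps can be sketched with scikit-learn as follows. The data here is synthetic and the labelling rule is invented purely so the example runs. It stands in for the real features and is not the notebook's actual code or results. The model is pickled in memory to avoid assuming a file name.

```python
import pickle

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the engineered features (not real IPL data).
rng = np.random.default_rng(0)
n = 400
runs_left = rng.integers(1, 200, size=n)
balls_left = rng.integers(1, 120, size=n)
wickets = rng.integers(1, 11, size=n)
X = np.column_stack([runs_left, balls_left, wickets]).astype(float)

# Invented label: the chasing side "wins" when the required rate is modest.
y = (runs_left / balls_left < 1.5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Column 0 is P(loss), column 1 is P(win) for the batting side.
proba = model.predict_proba(X_test)
accuracy = model.score(X_test, y_test)

# Round-trip through pickle, as an app would do when loading the saved model.
restored = pickle.loads(pickle.dumps(model))
```

`predict_proba` returns one row per match state with two columns that sum to one, which is what lets an app show a probability for each team.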
- The `app.py` script loads the trained model and provides an interactive UI for users to input match details and get win probabilities.
- The application is built using `Streamlit`, a powerful framework for creating web applications in Python.
- The model predicts win probabilities based on the current state of the match.
- The `Streamlit` app displays the win probabilities for both the batting and bowling teams.
- streamlit
- pandas
- numpy
- scikit-learn
- matplotlib
- This project is licensed under the MIT License. See the LICENSE file for more details.