
SATTVIKO/Football-Match-Prediction-System


Football Match Prediction System

Description

A machine learning system that predicts English Second Division match outcomes using Logistic Regression trained on historical data. It covers data preprocessing, feature engineering (goals, form), and visualizations (bar and pie charts), and includes a Streamlit app for interactive predictions. Built with Python, Pandas, Scikit-learn, Matplotlib, Seaborn, and Streamlit.

Features

  • Data Preprocessing: Cleans "England 2 CSV.csv" by handling missing values and keeping only teams that appear consistently in the data.
  • Feature Engineering: Derives features like average goals (Avg_Home_Goals, Avg_Away_Goals) and recent form (Home_Form, Away_Form).
  • Prediction Model: Logistic Regression predicts match outcomes (home win, draw, away win) with probability scores.
  • Visualizations:
    • Bar charts for half-time goals, fouls, corners, and yellow/red cards.
    • Pie charts for historical win/draw/loss distributions.
  • Streamlit App: Interactive web interface for team selection, stat adjustments, and visualized predictions.
  • User-Friendly Design: Prevents same-team selections and provides intuitive outputs.
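
As a rough illustration of the feature engineering listed above, the sketch below derives Avg_Home_Goals, Avg_Away_Goals, and Home_Form from a tiny made-up table. The column names follow the Prerequisites section, but the exact definitions are assumptions: here "form" is taken to be the mean points (3 for a win, 1 for a draw, 0 for a loss) over a team's previous three home matches, and the Home_Points helper column is hypothetical. The actual logic in train_and_predict.py may differ.

```python
import pandas as pd

# Made-up sample rows; the real data comes from "England 2 CSV.csv".
matches = pd.DataFrame({
    "HomeTeam": ["Blackburn", "Portsmouth", "Blackburn", "Portsmouth"],
    "AwayTeam": ["Portsmouth", "Blackburn", "Portsmouth", "Blackburn"],
    "FTH Goals": [2, 1, 0, 3],
    "FTA Goals": [1, 1, 2, 0],
})

# Average full-time goals per team, split by venue.
avg_home = matches.groupby("HomeTeam")["FTH Goals"].mean().rename("Avg_Home_Goals")
avg_away = matches.groupby("AwayTeam")["FTA Goals"].mean().rename("Avg_Away_Goals")
matches = matches.join(avg_home, on="HomeTeam").join(avg_away, on="AwayTeam")

# Assumed form definition: mean points over the previous 3 home matches.
goal_diff = matches["FTH Goals"] - matches["FTA Goals"]
matches["Home_Points"] = goal_diff.apply(lambda d: 3 if d > 0 else (1 if d == 0 else 0))

# shift(1) excludes the current match so the feature only uses past results.
matches["Home_Form"] = matches.groupby("HomeTeam")["Home_Points"].transform(
    lambda s: s.shift(1).rolling(window=3, min_periods=1).mean()
)

print(matches[["HomeTeam", "AwayTeam", "Avg_Home_Goals", "Avg_Away_Goals", "Home_Form"]])
```

A team's first home match has no history, so its Home_Form is NaN in this sketch; such rows would need to be filled or dropped before training. Away_Form could be built the same way by grouping on AwayTeam.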

Technologies Used

  • Python 3.8+
  • Pandas & NumPy (data manipulation)
  • Scikit-learn (machine learning)
  • Matplotlib & Seaborn (visualization)
  • Streamlit (web app)
  • Pickle (model serialization)

Repository Structure

  • train_and_predict.py: Processes data, trains model, generates predictions, and creates visualizations.
  • app.py: Streamlit app for interactive predictions and visualizations.
  • football_prediction_model.pkl: Pre-trained Logistic Regression model (generate via train_and_predict.py).
  • README.md: Project documentation.
  • Note: England 2 CSV.csv is not included; users must provide their own dataset.
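
To show how a file like football_prediction_model.pkl can be produced and read back, here is a minimal sketch that trains a Logistic Regression model on synthetic data, saves it with pickle, and reloads it. The random features and labels are placeholders, not the project's data, and the file is written to a temporary directory so an existing model is not overwritten.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data: 4 made-up numeric features and 3 outcome labels
# (H = home win, D = draw, A = away win).
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 4))
y = np.array(["H", "D", "A"] * 30)

model = LogisticRegression(max_iter=1000).fit(X, y)

# Save and reload the model, as the training script and the app would.
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "football_prediction_model.pkl"
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

# One probability per class, in the order given by restored.classes_.
probs = restored.predict_proba(X[:1])[0]
print(dict(zip(restored.classes_, probs.round(4))))
```

Pickle files should only be loaded from sources you trust, and a model is best unpickled with the same scikit-learn version that created it.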

Prerequisites

  • Python 3.8 or higher
  • A CSV dataset (England 2 CSV.csv) with columns like HomeTeam, AwayTeam, FTH Goals, FTA Goals, HTH Goals, H Fouls, H Corners, H Yellow, H Red, etc.
  • Git (optional, for cloning)
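
Before training, it can help to confirm that your CSV has the expected header. The sketch below checks only the columns named above; the README's "etc." means the scripts may need more, so treat REQUIRED_COLUMNS and the missing_columns helper as illustrative assumptions rather than the project's own validation.

```python
import csv
import io

# Only the columns explicitly named in the Prerequisites list (assumed subset).
REQUIRED_COLUMNS = [
    "HomeTeam", "AwayTeam", "FTH Goals", "FTA Goals", "HTH Goals",
    "H Fouls", "H Corners", "H Yellow", "H Red",
]


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [col for col in required if col not in present]


# In-memory example; for a real file use:
#   with open("England 2 CSV.csv", newline="") as f: missing_columns(f)
sample = io.StringIO("HomeTeam,AwayTeam,FTH Goals,FTA Goals\nBlackburn,Portsmouth,2,1\n")
print(missing_columns(sample))
```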

Setup Instructions

  1. Clone the Repository:

    git clone https://github.com/SATTVIKO/Football-Match-Prediction-System.git
    cd Football-Match-Prediction-System
  2. Install Dependencies: Create a virtual environment (optional) and install required packages:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    pip install pandas numpy scikit-learn matplotlib seaborn streamlit
  3. Prepare the Dataset:

    • Place England 2 CSV.csv in the project root directory.
    • Ensure it matches the expected format (see Prerequisites for required columns).
  4. Train the Model: Run the training script to process data, train the model, and save football_prediction_model.pkl:

    python train_and_predict.py
  5. Run the Streamlit App: Launch the web application:

    streamlit run app.py

    Access it at http://localhost:8501 in your browser.

Usage

  • Training Script (train_and_predict.py):
    • Processes the dataset, trains the Logistic Regression model, and generates predictions for predefined matches (e.g., Blackburn vs. Portsmouth).
    • Outputs visualizations (bar/pie charts) for team stats and saves the model.
  • Streamlit App (app.py):
    • Open the app in your browser, select home and away teams, and adjust stats if desired.
    • Click "Predict Match" to view the predicted outcome, probability distribution, and visualizations.
    • Explore historical performance via pie charts and stat comparisons via bar charts.
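
The same-team guard in the app can be thought of as two small checks, sketched here as plain Python so the logic is easy to test. The function names away_team_options and validate_selection are hypothetical and not taken from app.py; in a Streamlit app, the filtered list could be passed as the options of the away-team select box.

```python
def away_team_options(teams, home_team):
    """Return the teams that may be chosen as the away side."""
    return [team for team in teams if team != home_team]


def validate_selection(home_team, away_team):
    """Raise ValueError if the same team is selected twice."""
    if home_team == away_team:
        raise ValueError("Home and away teams must be different.")
    return home_team, away_team


teams = ["Blackburn", "Portsmouth", "Leeds"]
print(away_team_options(teams, "Blackburn"))
print(validate_selection("Blackburn", "Portsmouth"))
```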

Example Output

  • Prediction:
    Blackburn vs Portsmouth: Home Win (H)
    Probabilities - H: 0.5234, D: 0.2345, A: 0.2421
    
  • Visualizations: Bar charts for half-time goals, fouls, corners, cards; pie charts for win/draw/loss distributions.
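
The prediction lines above can be produced from a classifier's class labels and probabilities roughly as follows. This is a sketch: format_prediction is a hypothetical helper, and it assumes the model uses the labels H, D, and A, with the predicted outcome being the class with the highest probability.

```python
OUTCOME_NAMES = {"H": "Home Win", "D": "Draw", "A": "Away Win"}


def format_prediction(home, away, classes, probs):
    """Format a prediction from class labels and their matching probabilities."""
    by_class = dict(zip(classes, probs))
    best = max(by_class, key=by_class.get)  # most probable outcome
    headline = f"{home} vs {away}: {OUTCOME_NAMES[best]} ({best})"
    detail = "Probabilities - " + ", ".join(
        f"{label}: {by_class[label]:.4f}" for label in ("H", "D", "A")
    )
    return headline + "\n" + detail


# scikit-learn orders classes alphabetically, so probabilities arrive as A, D, H.
print(format_prediction("Blackburn", "Portsmouth", ["A", "D", "H"], [0.2421, 0.2345, 0.5234]))
```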

Future Enhancements

  • Integrate real-time data via sports APIs (e.g., Opta).
  • Experiment with advanced models (e.g., XGBoost, neural networks).
  • Add player-specific features (e.g., top scorer stats).
  • Expand to other leagues (e.g., Premier League).
  • Deploy to a cloud platform (e.g., Heroku) for public access.

Contributing

Contributions are welcome!

Contact

For questions or feedback, reach out to sattviky@gmail.com

