Football Web Scraper

This project is a Python-based web scraper that collects detailed football match event data from websites such as WhoScored. It automates the extraction of key match statistics and stores them in a Supabase-hosted PostgreSQL database for further analysis and visualization.

Features

Automated Scraping: Scrapes football match events, including xG, possession, player, and team statistics.
Data Handling: Uses BeautifulSoup for HTML parsing and Selenium for navigating dynamically rendered pages.
Supabase Integration: Data is automatically stored in a PostgreSQL database hosted on Supabase, making it easily accessible for further analysis.
Efficient Data Storage: Uses Pandas and Pydantic models to structure, validate, and store match events.
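
As a rough illustration of the validation step described above, the sketch below checks raw event rows against a Pydantic model before they are stored. The `MatchEvent` schema, its field names, and the `validate_events` helper are hypothetical and are not taken from this project's code; the actual models may differ.

```python
from typing import List, Optional, Tuple

from pydantic import BaseModel, Field, ValidationError


class MatchEvent(BaseModel):
    """Hypothetical event schema; field names are illustrative assumptions."""

    match_id: int
    minute: int = Field(..., ge=0)  # a minute cannot be negative
    team: str
    player: Optional[str] = None
    event_type: str
    xg: Optional[float] = Field(default=None, ge=0, le=1)  # xG is a probability


def validate_events(raw_rows: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Split raw rows into validated dicts and rows that failed validation."""
    valid, rejected = [], []
    for row in raw_rows:
        try:
            event = MatchEvent(**row)
        except ValidationError:
            rejected.append(row)
        else:
            # Iterating a model yields (field, value) pairs, so dict() works.
            valid.append(dict(event))
    return valid, rejected


# Made-up sample rows: the second one has an invalid (negative) minute.
raw_rows = [
    {"match_id": 1, "minute": 12, "team": "Home", "player": "A. Player",
     "event_type": "Shot", "xg": 0.31},
    {"match_id": 1, "minute": -5, "team": "Away", "event_type": "Pass"},
]
valid_rows, rejected_rows = validate_events(raw_rows)
print(len(valid_rows), "valid,", len(rejected_rows), "rejected")
```

The resulting list of dictionaries can be passed to `pandas.DataFrame(valid_rows)` for further structuring before it is written to the database.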

Technologies Used

Python: Core language for writing the scraping scripts.
BeautifulSoup: Used for parsing HTML content from the websites.
Selenium: Automates the browser to interact with dynamic web pages and handle JavaScript content.
Pandas: For data structuring and manipulation.
Pydantic: Used to ensure data models are validated before storage.
Supabase: Cloud-based PostgreSQL database for data storage and retrieval.
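
To give a feel for the parsing step, here is a minimal BeautifulSoup sketch that extracts events from an HTML string. The markup in `SAMPLE_HTML`, the CSS selector, and the `parse_events` helper are invented for illustration; they do not reflect the real WhoScored page structure or this project's actual parsing code.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, NOT the real structure of any scraped site.
SAMPLE_HTML = """
<table id="events">
  <tr class="event" data-minute="12" data-type="Shot">
    <td class="player">A. Player</td>
  </tr>
  <tr class="event" data-minute="47" data-type="Pass">
    <td class="player">B. Player</td>
  </tr>
</table>
"""


def parse_events(html: str) -> list:
    """Turn each <tr class="event"> row into a plain dictionary."""
    soup = BeautifulSoup(html, "html.parser")
    events = []
    for row in soup.select("tr.event"):
        player_cell = row.find("td", class_="player")
        events.append({
            "minute": int(row["data-minute"]),
            "event_type": row["data-type"],
            "player": player_cell.get_text(strip=True) if player_cell else None,
        })
    return events


events = parse_events(SAMPLE_HTML)
print(events)
```

When the page is rendered by JavaScript, the HTML string would typically come from a Selenium driver's `page_source` attribute after the page has loaded, rather than from a static sample like the one above.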

How to Run the Project

Clone the repository:
```
git clone <repository-url>
```
Install the required Python libraries:
```
pip install -r requirements.txt
```

Create a .env file containing your Supabase credentials:

project_url=<your-supabase-url>
project_api=<your-supabase-api-key>
supabase_password=<your-supabase-password>
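
Before running the scraper, it can help to confirm that all three settings are present. The sketch below is one possible check using only the standard library; the `load_env_file` and `missing_keys` helpers are hypothetical and are not part of this project, which may load its settings differently (for example with a dedicated dotenv package).

```python
import os
from pathlib import Path

# Key names taken from the .env template above.
REQUIRED_KEYS = ("project_url", "project_api", "supabase_password")


def load_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(config: dict) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Values already set in the process environment override the file.
config = {
    **load_env_file(),
    **{key: os.environ[key] for key in REQUIRED_KEYS if key in os.environ},
}
missing = missing_keys(config)
if missing:
    print("Missing settings:", ", ".join(missing))
```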

Run the script to start scraping match events:
```
python scraper.py
```

Credits

Data sourced from WhoScored and stored in Supabase.

Feel free to contribute or raise issues for improvements.

duytran27/Football-Data-Web-Scraper
