This R script scrapes league standings tables from Transfermarkt, a popular football statistics website. It cleans the scraped data, merges it with league information, and saves the result as a CSV file for further analysis.
- Web Scraping: Automatically retrieves league standings data from Transfermarkt for specified seasons (e.g., 2003-2023 or 2015-2024).
- Data Processing: Cleans and organizes the scraped data for analysis.
- CSV Export: Saves the processed data into a CSV file for easy sharing and further use.
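
The scraping step can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the helper names (`standings_url`, `parse_standings`, `scrape_seasons`), the URL pattern, and the `table.items` selector are all assumptions that should be checked against the live site before use.

```r
library(rvest)
library(dplyr)
library(purrr)

# Hypothetical helper: build the standings URL for one league and season.
# ASSUMPTION: this URL pattern is illustrative; confirm it on the site.
standings_url <- function(league_slug, league_code, season_year) {
  sprintf(
    "https://www.transfermarkt.com/%s/tabelle/wettbewerb/%s/saison_id/%d",
    league_slug, league_code, as.integer(season_year)
  )
}

# Hypothetical helper: extract the first table matching `selector` from a
# parsed page and tag every row with its season.
# ASSUMPTION: "table.items" is a guess at the standings table's CSS class.
parse_standings <- function(page, season_year, selector = "table.items") {
  page %>%
    html_element(selector) %>%
    html_table() %>%
    mutate(season = season_year)
}

# Hypothetical helper: scrape several seasons and stack them into one table.
scrape_seasons <- function(league_slug, league_code, seasons) {
  map(seasons, function(season_year) {
    Sys.sleep(2)  # pause between requests to keep the load on the server low
    page <- read_html(standings_url(league_slug, league_code, season_year))
    parse_standings(page, season_year)
  }) %>%
    bind_rows()
}

# Offline check of the parsing step on a tiny placeholder table (not real data).
demo_page <- minimal_html(
  '<table class="items">
     <tr><th>Club</th><th>Pts</th></tr>
     <tr><td>Team A</td><td>80</td></tr>
     <tr><td>Team B</td><td>75</td></tr>
   </table>'
)
demo <- parse_standings(demo_page, 2022)
```

Keeping URL construction, parsing, and looping in separate functions makes it possible to test the parsing logic on a local HTML snippet without sending any requests.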
- Ensure you have the required R packages installed: `dplyr`, `tidyverse`, `rvest`, `openxlsx`, and `purrr`.
- Update the `comp_25.csv` file path in the script to match your local directory.
- Specify the range of seasons you want to scrape (e.g., 2003-2023 or 2015-2025) in the script.
- Run the script to scrape, process, and export the data.
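
The setup steps above might look something like this at the top of the script. The variable names (`required_pkgs`, `comp_path`, `seasons`) are hypothetical and may differ from the ones the script actually uses; the directory in `comp_path` is a placeholder.

```r
# Packages the script depends on (taken from the requirements list).
required_pkgs <- c("dplyr", "tidyverse", "rvest", "openxlsx", "purrr")

# Report which packages are missing instead of installing them silently.
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]
if (length(missing_pkgs) > 0) {
  message(
    "Missing packages. Install them with: install.packages(c(",
    paste(sprintf('"%s"', missing_pkgs), collapse = ", "), "))"
  )
}

# ASSUMPTION: placeholder location; point this at your local copy of comp_25.csv.
comp_path <- file.path("path", "to", "comp_25.csv")

# Seasons to scrape, given as an inclusive range of years.
seasons <- 2003:2023
```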
The script generates a CSV file containing the merged league standings and additional league information.
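
As a rough sketch of the merge and export step, assuming the standings and the league information share a key column: the column names (`league_code`, `league_name`), the output file name, and the rows below are placeholders for illustration only, not the script's real schema or data.

```r
library(dplyr)

# Placeholder standings rows standing in for the scraped tables.
standings <- data.frame(
  league_code = c("L1", "L1", "L2"),
  season      = c(2022L, 2022L, 2022L),
  club        = c("Team A", "Team B", "Team C"),
  points      = c(80L, 75L, 70L)
)

# Placeholder league information standing in for the contents of comp_25.csv.
leagues <- data.frame(
  league_code = c("L1", "L2"),
  league_name = c("League One", "League Two")
)

# Keep every standings row and attach the matching league details.
merged <- left_join(standings, leagues, by = "league_code")

# Write to a temporary location here; the script would use its own output path.
out_path <- file.path(tempdir(), "league_standings.csv")
write.csv(merged, out_path, row.names = FALSE)
```

A left join keeps standings rows even when a league has no match in the league file, so missing league details show up as `NA` rather than silently dropping rows.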
- R (version 4.0 or higher)
- RStudio (recommended)
- Required R packages: `dplyr`, `tidyverse`, `rvest`, `openxlsx`, `purrr`
Contributions are welcome! If you have suggestions or improvements, please open an issue or submit a pull request.