A Python tool that scrapes sumo wrestling match data from the Japan Sumo Association website and stores it in a SQLite database for tracking and analysis.
- Scrapes match data from the Japan Sumo Association website
- Stores match information in a SQLite database
- Tracks wrestler names, divisions, winning techniques, and match results
- Prevents duplicate match entries
- Supports collecting data from multiple tournaments
- Updates matches daily during the current tournament
- Python 3.9 or higher
- pip (Python package installer)
- Clone the repository:
git clone https://github.com/suineg/SumoTracker.git
cd SumoTracker
- Create and activate a virtual environment:
On Windows:
python -m venv venv
venv\Scripts\activate
On macOS/Linux:
python3 -m venv venv
source venv/bin/activate
- Install the package:
pip install -e .
After installation, the following commands will be available in your terminal:
To collect data from recent tournaments:
sumo-scrape
To update matches for the current tournament day:
sumo-daily
For automated daily updates, you can set up a cron job (Linux/macOS) or Task Scheduler (Windows).
- Find the path to your virtual environment's Python:
which python
- Create a shell script (e.g., update_sumo.sh):
#!/bin/bash
cd /path/to/SumoTracker || exit 1
source venv/bin/activate
sumo-daily
deactivate
- Make the script executable:
chmod +x update_sumo.sh
- Add a cron job (runs at 19:00 local time):
crontab -e
0 19 * * * /path/to/update_sumo.sh >> /path/to/sumo_updates.log 2>&1
- Create a batch script (e.g., update_sumo.bat):
@echo off
cd /d C:\path\to\SumoTracker
call venv\Scripts\activate
sumo-daily
deactivate
- Open Task Scheduler:
- Create a new task
- Set the trigger to daily at 7:00 PM
- Action: Start a program
- Program: C:\path\to\update_sumo.bat
To test the scraper with a single day of matches:
sumo-test
To view stored match data and statistics:
sumo-query
By default, the SQLite database (sumo_matches.db) is created in your current working directory. You can specify a different location by setting the SUMO_DB_PATH environment variable:
On Windows:
set SUMO_DB_PATH=C:\path\to\your\database.db
On macOS/Linux:
export SUMO_DB_PATH=/path/to/your/database.db
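For scripts that read the database directly, the lookup described above might be mirrored as follows. This is a minimal sketch: resolve_db_path is a hypothetical helper, not part of SumoTracker's documented API, and the package's own resolution logic may differ.

```python
import os
from pathlib import Path

# Default file name as stated in this README.
DEFAULT_DB_NAME = "sumo_matches.db"


def resolve_db_path() -> Path:
    """Return the database path, preferring SUMO_DB_PATH when it is set.

    Hypothetical helper for illustration only.
    """
    override = os.environ.get("SUMO_DB_PATH")
    if override:
        # Expand "~" so values like ~/data/sumo.db also work.
        return Path(override).expanduser()
    # Fall back to the current working directory.
    return Path.cwd() / DEFAULT_DB_NAME


if __name__ == "__main__":
    print(f"Using database at: {resolve_db_path()}")
```

Resolving the path in one place keeps ad hoc analysis scripts pointed at the same file the commands write to.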
The SQLite database contains a single table, matches, with the following schema:
- id: Primary key
- tournament_id: Tournament identifier
- wrestler_name: Name of the wrestler
- division: Division name
- winning_technique: Technique used to win (if applicable)
- win_loss: Result of the match ('win' or 'loss')
- match_date: Date of the match
- opponent_name: Name of the opponent
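The columns listed above could be expressed with Python's built-in sqlite3 module roughly as shown here. This is a sketch under stated assumptions, not the project's actual DDL: the column types, the CHECK constraint, and the UNIQUE key used to reject duplicate matches are all guesses, insert_match is a hypothetical helper, and the sample row uses placeholder values.

```python
import sqlite3

# Assumed schema: column names come from this README; types and
# constraints are illustrative guesses.
SCHEMA = """
CREATE TABLE IF NOT EXISTS matches (
    id INTEGER PRIMARY KEY,
    tournament_id TEXT NOT NULL,
    wrestler_name TEXT NOT NULL,
    division TEXT,
    winning_technique TEXT,
    win_loss TEXT CHECK (win_loss IN ('win', 'loss')),
    match_date TEXT,
    opponent_name TEXT,
    UNIQUE (tournament_id, match_date, wrestler_name, opponent_name)
);
"""

INSERT_SQL = """
INSERT OR IGNORE INTO matches
    (tournament_id, wrestler_name, division, winning_technique,
     win_loss, match_date, opponent_name)
VALUES
    (:tournament_id, :wrestler_name, :division, :winning_technique,
     :win_loss, :match_date, :opponent_name)
"""


def insert_match(conn: sqlite3.Connection, row: dict) -> bool:
    """Insert one match; return False if it was skipped as a duplicate."""
    cur = conn.execute(INSERT_SQL, row)
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # throwaway database for the demo
    conn.executescript(SCHEMA)
    sample = {  # placeholder values, not real match data
        "tournament_id": "example-basho",
        "wrestler_name": "Wrestler A",
        "division": "Makuuchi",
        "winning_technique": "yorikiri",
        "win_loss": "win",
        "match_date": "2024-01-14",
        "opponent_name": "Wrestler B",
    }
    print(insert_match(conn, sample))  # first insert is stored
    print(insert_match(conn, sample))  # identical row is ignored
```

Letting the database enforce uniqueness means a scrape can be re-run safely without checking for existing rows first.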
- Command not found: If the commands aren't available after installation:
  - Make sure your virtual environment is activated
  - Try reinstalling the package: pip install -e .
  - Check if your Python scripts directory is in PATH
- Database errors:
  - Ensure you have write permissions in the directory
  - Try specifying a different database path using SUMO_DB_PATH
- Import errors:
  - Make sure you've installed the package with pip install -e .
  - Verify that all dependencies were installed correctly
If you encounter any issues:
- Check the error message and the troubleshooting section above
- Look for similar reports in the project's GitHub Issues
- Create a new issue with details about your problem
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
This tool is for educational purposes only. Please respect the Japan Sumo Association's website terms of service and implement appropriate rate limiting in production use.
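One simple way to rate limit a scraper is to enforce a minimum delay between consecutive requests. The sketch below is illustrative only: Throttle is a hypothetical class, not something SumoTracker is known to ship, and the two-second interval is an arbitrary example rather than a value recommended by the site.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._last = None      # time of the previous call, if any

    def wait(self):
        """Sleep if the previous call was too recent; return seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay:
                self._sleep(delay)
        self._last = self._clock()
        return delay


if __name__ == "__main__":
    throttle = Throttle(min_interval=2.0)  # arbitrary example interval
    for page in ("page-1", "page-2", "page-3"):  # placeholder work items
        throttle.wait()  # call this before each HTTP request
        print(f"would fetch {page} now")
```

Calling wait() immediately before each request keeps the pacing logic separate from the fetching code.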