Author: Fernando García Catalán ✍️
GitHub: https://github.com/fergarcat 🌐
Scrum Project: https://github.com/users/fergarcat/projects/2 📋
BME Scraper is a tool for scraping data from the website of BME (Bolsas y Mercados Españoles, the Spanish stock exchange).
It automates the extraction of relevant financial data for analysis or reporting.
This is my second project during the AI Bootcamp at https://github.com/Factoria-F5-dev/ 🎓
- Scrape company and stock data from the BME website (https://www.bolsasymercados.es/). 💻
- Save data to a local SQLite database. 💾
- Configurable scraping options. ⚙️
- Dump and deletion options for the database. 🗑️
- Logging to `myapp.log`. 📝
- Verbose output option. 🔍
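The save-to-SQLite feature can be sketched with the stdlib `sqlite3` module. This is a minimal sketch only: the `companies` table exists in the real database, but the column names used here are assumptions, not the project's actual schema.

```python
import sqlite3

def save_companies(con, rows):
    """Insert or update (ticker, name) pairs in the companies table.
    Column names are illustrative; the real schema may differ."""
    con.execute(
        "CREATE TABLE IF NOT EXISTS companies (ticker TEXT PRIMARY KEY, name TEXT)"
    )
    con.executemany(
        "INSERT OR REPLACE INTO companies (ticker, name) VALUES (?, ?)", rows
    )
    con.commit()
    return con.execute("SELECT COUNT(*) FROM companies").fetchone()[0]

# Demo with an in-memory db; the real tool writes to the path configured in .env
con = sqlite3.connect(":memory:")
print(save_companies(con, [("SAN", "Banco Santander"), ("ITX", "Inditex")]))  # 2
```

`INSERT OR REPLACE` keyed on the ticker makes repeated scrape runs idempotent: re-scraping the same company updates its row instead of duplicating it.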
Dependencies are listed in the `requirements.txt` file.
- Clone the repository:
```bash
git clone https://github.com/fergarcat/bme-scraper
```
- Navigate to the project directory:
```bash
cd bme-scraper
```
- Create a virtual environment:
```bash
python -m venv .venv   # Windows
python3 -m venv .venv  # macOS/Linux
```
- Activate the virtual environment:
```bash
.venv\Scripts\activate     # Windows
source .venv/bin/activate  # macOS/Linux
```
- Install dependencies:
```bash
uv pip install -r requirements.txt  # Windows
pip3 install -r requirements.txt    # macOS/Linux
```
- Rename `.env.example` to `.env` in the project directory, then set the database path and default waiting time in that file:
```bash
rename .env.example .env
```
- Verify the installation:
```bash
python main.py --help
```
Arguments are divided into two groups, `--scrape` and `--database`, plus general options.
General options:
```
--help          Show options
--version       Show the program version
--verbose       Show detailed information during execution
--wait          Set a random wait time, in seconds, from 1 to the given value. Default: 5
```
Scrape options:
```
--scrape        Start the scrape
--all           Run all available scrapes
--companies     Scrape company info
--stock_values  Scrape current stock values
```
Database options:
```
--database      Select the database command group
--delete_file   Delete the SQLite database file
--delete_db     Delete all rows from all tables; keeps the file and table structure
--dump          Dump the database structure and data to a dump.sql file
```
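These option groups could be declared with `argparse` along the following lines. This is a sketch, not the project's actual `main.py`, and the version string is a placeholder:

```python
import argparse

def build_parser():
    """Build a CLI parser mirroring the options table above (illustrative only)."""
    parser = argparse.ArgumentParser(prog="main.py", description="BME Scraper")
    parser.add_argument("--version", action="version", version="%(prog)s 0.0.0")
    parser.add_argument("--verbose", action="store_true",
                        help="detailed information during execution")
    parser.add_argument("--wait", type=int, default=5,
                        help="random wait time upper bound, in seconds")

    scrape = parser.add_argument_group("scrape")
    scrape.add_argument("--scrape", action="store_true", help="start the scrape")
    scrape.add_argument("--all", action="store_true", help="all available scrapes")
    scrape.add_argument("--companies", action="store_true", help="company info")
    scrape.add_argument("--stock_values", action="store_true",
                        help="current stock values")

    db = parser.add_argument_group("database")
    db.add_argument("--database", action="store_true", help="database command group")
    db.add_argument("--delete_file", action="store_true", help="delete the db file")
    db.add_argument("--delete_db", action="store_true", help="empty all tables")
    db.add_argument("--dump", action="store_true", help="dump to dump.sql")
    return parser

args = build_parser().parse_args(["--scrape", "--all", "--wait", "5", "--verbose"])
print(args.scrape, args.all, args.wait, args.verbose)  # True True 5 True
```

`add_argument_group` only affects how `--help` output is sectioned; all flags still land on the same namespace object.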
- Check the available options:
```bash
python main.py --help
```
- Run the scraper with the desired options:
```bash
# Scrape all companies and current stock values with up to a 5-second random
# delay between requests. Shows verbose output.
python main.py --scrape --all --wait 5 --verbose
```
- Dump the database to a SQL file:
```bash
python main.py --database --dump  # Dump the database to the dump.sql file
```
- Delete the database:
```bash
python main.py --database --delete_file  # Delete the database file
python main.py --database --delete_db    # Delete all data; keep the file and table structure
```
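The `--wait` behaviour used above (a random pause of 1 to N seconds between requests) can be sketched as follows. The function name is hypothetical, not the project's actual helper:

```python
import random
import time

def polite_wait(max_seconds=5):
    """Sleep for a random 1..max_seconds seconds between requests
    (mirrors the --wait option) and return the delay used."""
    delay = random.randint(1, max_seconds)
    time.sleep(delay)
    return delay

# The delay always stays within the configured bounds
assert 1 <= polite_wait(2) <= 2
```

Randomizing the delay instead of using a fixed interval makes request timing less uniform, which is gentler on the scraped site.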
A Dockerfile and a docker-compose.yml file are included in the project for containerization. A cron job is set to run the scraper every day from Monday to Friday between 8:00 and 20:00.
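That schedule window can be expressed as a small predicate; this helper is an illustration of the Monday-Friday, 8:00-20:00 rule, not part of the project:

```python
from datetime import datetime

def in_schedule(dt):
    """True when dt falls inside the cron window: Mon-Fri, 08:00-19:59."""
    return dt.weekday() < 5 and 8 <= dt.hour < 20

print(in_schedule(datetime(2024, 1, 8, 9, 0)))   # Monday 09:00 -> True
print(in_schedule(datetime(2024, 1, 6, 9, 0)))   # Saturday 09:00 -> False
```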
- Build and start the container:
```bash
docker-compose up --build
```
- Access the running container:
```bash
docker ps                            # Find the container ID
docker exec -it <container_id> bash  # Open a shell inside the container
```
- Check the logs:
```bash
cat myapp.log
```
- Connect to the database:
```bash
sqlite3 bme.db
```
- Select data:
```sql
SELECT * FROM companies;
SELECT * FROM stock_values;
```
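The same queries can be run from Python with the stdlib `sqlite3` module. Since SQLite cannot bind table names as query parameters, the name is checked against a whitelist; the demo schema below is an assumption, not the project's real one:

```python
import sqlite3

TABLES = {"companies", "stock_values"}

def fetch_all(con, table):
    """Return every row of one of the scraper's tables."""
    if table not in TABLES:
        raise ValueError(f"unknown table: {table}")
    return con.execute(f"SELECT * FROM {table}").fetchall()

# Demo against an in-memory database shaped like the real one (columns assumed)
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE companies (ticker TEXT, name TEXT)")
con.execute("INSERT INTO companies VALUES ('ITX', 'Inditex')")
print(fetch_all(con, "companies"))  # [('ITX', 'Inditex')]
```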
Contributions are welcome! Please fork the repository and submit a pull request.
This project is licensed under the MIT License. See the `LICENSE` file for details.