The Brazil Weather Data project is a purely educational initiative aimed at transforming the data from Brazilian automatic weather stations into a lightweight API. This project serves as an excellent resource for learning and experimentation with API development, data handling, and modern software engineering practices.
This API is hosted on a free-tier web service on Render. Due to the limited resources available, the data is ingested on a local machine and only uploaded to the repository.
You can access the documentation at GitHub.
The primary goal of this project is to provide easy and structured access to meteorological data from Brazilian weather stations. The API offers three main endpoints:
- stations_data: Provides data about the weather stations, including geographical location and date of implementation.
- weather_data: Offers meteorological data collected from the stations.
- query: Allows users to simulate SQL queries for customized data retrieval.
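As a rough illustration of how a client might address these endpoints, the sketch below builds request URLs with the standard library only. The base URL assumes a local instance on port 8000. The exact paths and the `limit` parameter in the usage note are assumptions for illustration, not confirmed parts of the API; check the API documentation for the real routes and parameters.

```python
from urllib.parse import urlencode, urljoin

# Assumption: the API is running locally on port 8000.
BASE_URL = "http://localhost:8000/"

# Endpoint names come from the list above; their exact paths are an assumption.
ENDPOINTS = ("stations_data", "weather_data", "query")


def endpoint_url(endpoint: str, base_url: str = BASE_URL, **params: str) -> str:
    """Build the URL for one endpoint, with optional query-string parameters."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {endpoint!r}")
    url = urljoin(base_url, endpoint)
    if params:
        # Parameter names are supplied by the caller; none are defined here.
        url = f"{url}?{urlencode(params)}"
    return url


if __name__ == "__main__":
    for name in ENDPOINTS:
        print(endpoint_url(name))
```

For example, `endpoint_url("query", limit="10")` produces a URL with a hypothetical `limit` parameter appended as a query string.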
The data is sourced from the Brazilian National Institute of Meteorology (INMET - Instituto Nacional de Meteorologia), ensuring reliability and comprehensiveness.
This project is a refactored version of a Kaggle notebook focused on collecting and analyzing Brazilian weather data. The technical stack for this project includes:
- pandas for data wrangling.
- FastAPI for building the API.
- DuckDB as the database management system.
- pytest for running tests.
- Pre-commit and Commitizen to ensure code quality and standardized commit messages.
- Continuous Integration and Continuous Deployment (CI/CD) using GitHub Actions.
Initially, the database for this project was populated locally due to resource constraints. This approach ensured a solid foundation for our initial data analysis and API functionality. In a professional setting, however, the system would be designed to be scalable and automated.
Ideally, the data collection pipeline would run automatically on a monthly basis, so that the database is continuously updated with the latest weather data. Such a setup would provide up-to-date insights and enrich the database over time, enhancing the depth and accuracy of our analyses.
This automated approach, combined with a more robust hosting solution, would make the project more dynamic and valuable for ongoing weather data analysis and research.
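A monthly run could be driven by a small helper like the one sketched below. It decides which years to re-ingest and builds the argument list for the project's `poetry run task pipeline -- <years>` command. The function names and the rule that a January run also refreshes the previous year are assumptions made for illustration, not part of the project.

```python
from datetime import date


def years_to_refresh(today: date) -> list[int]:
    """Return the years a monthly run should re-ingest.

    Assumption: a run in January also refreshes the previous year, so that
    December data published late is still picked up.
    """
    if today.month == 1:
        return [today.year - 1, today.year]
    return [today.year]


def pipeline_command(years: list[int]) -> list[str]:
    """Build the argv for the pipeline task, one argument per year."""
    return ["poetry", "run", "task", "pipeline", "--", *map(str, years)]


if __name__ == "__main__":
    # Print the command a scheduler would execute today.
    print(" ".join(pipeline_command(years_to_refresh(date.today()))))
```

A scheduler (for instance cron, or a scheduled GitHub Actions workflow) could pass the resulting list to `subprocess.run(..., check=True)` once a month.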
To get started with the Brazil Weather Data API, follow these steps:
- Ensure you have pyenv and Poetry installed on your system for dependency management.
- Check that Python 3.11.11 is available on your system:
    pyenv versions
  If 3.11.11 is not listed, install it:
    pyenv install 3.11.11
- Clone the repository from GitHub:
    git clone https://github.com/gregomelo/brazil_weather_data.git
- Navigate to the cloned directory and install the dependencies using Poetry:
    cd brazil_weather_data
    pyenv local 3.11.11
    poetry env use 3.11.11
    poetry install --no-root
    poetry lock --no-update
To run the Brazil Weather Data API with Docker instead, follow these steps:
- Ensure you have [Docker](https://www.docker.com/) installed and around 3 GB of free disk space.
- Clone the repository from GitHub:
    git clone https://github.com/gregomelo/brazil_weather_data.git
- Build the image:
    docker build -t bwd-container .
- After the build completes, start the container:
    docker run -d -p 8000:8000 -p 8001:8001 bwd-container
- To use the API, open http://localhost:8000/. To read the documentation, open http://localhost:8001/.
- To start the API locally, run the following command:
poetry run task run
- Open your favorite browser and navigate to Brazil Weather Data API.
Note: if you are running another service on port 8000, edit the port in the [tool.taskipy.tasks] section of the pyproject.toml file.
To kill all other processes on port 8000, use the command:
poetry run task killr
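Before editing the port or killing processes, it can help to confirm that the port is actually occupied. The snippet below is a minimal, standard-library sketch of such a check. The `port_in_use` helper is hypothetical and not part of the project; it only reports whether something accepts TCP connections on the given local port.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in (8000, 8001):  # API and documentation ports used in this README
        state = "in use" if port_in_use(port) else "free"
        print(f"port {port}: {state}")
```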
- To serve the documentation locally, run the following command:
poetry run task docs
- Open your favorite browser and navigate to Brazil Weather Data API Docs.
Note: if you are running another service on port 8001, edit the port in the [tool.taskipy.tasks] section of the pyproject.toml file.
To kill all other processes on port 8001, use the command:
poetry run task killd
To run the tests, use the following command:
poetry run task test
To run the data pipeline, use the following command, replacing list_years with the years to ingest:
poetry run task pipeline -- list_years
For example, to run the pipeline only for 2023:
poetry run task pipeline -- 2023
For more than one year, type the years separated by a single space:
poetry run task pipeline -- 2023 2022 2021
As an educational project, contributions are highly encouraged. Whether you're looking to fix bugs, add features, or improve documentation, your input is welcome. Please follow the standard GitHub flow for contributions.
This project is open-sourced under the MIT license.
Enjoy exploring and utilizing the Brazil Weather Data API! 🌦️🇧🇷