ML Pipeline for River Flow Estimation

This repository provides a machine learning pipeline designed to estimate mean and low reference flows for Brazilian river stretches. It includes scripts for data collection, preprocessing, model training, and evaluation.

Data Generation Workflow

Step 1: Data Collection

Description: Data was collected using the Google Earth Engine Python API to extract hydrological and environmental metrics for Brazilian river stretches.
File: src/data_treatment/gee_data_extract.py

Step 2: Data Pre-processing

Description: The raw data was processed using topological information from the Brazilian Hydrography Ottocodified (BHO) to generate features. Run the scripts in the following order:
1. Structure the flow data: src/data_treatment/org_flow.py
2. Aggregate all input data: src/data_treatment/agg_att.py
3. Accumulate the aggregated attributes over the upstream catchment: src/data_treatment/acc_att.py
4. Structure all data for use by the ML models: src/data_treatment/to_ml.py
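
The accumulation in item 3 can be pictured with a small example. This is a minimal sketch, not the code in acc_att.py: it assumes each stretch drains to a single downstream stretch and that the attribute sums downstream (as an area would). The names `accumulate_upstream`, `downstream`, and `local_area` are hypothetical and do not reflect the BHO schema.

```python
from collections import deque


def accumulate_upstream(downstream, local):
    """Sum a local attribute over each stretch and everything upstream of it.

    downstream: dict mapping stretch id -> id of the next stretch downstream
                (None at an outlet). Hypothetical layout, not the BHO schema.
    local:      dict mapping stretch id -> local (per-catchment) value.
    """
    # Count how many direct upstream neighbours each stretch has.
    pending = {sid: 0 for sid in local}
    for sid, down in downstream.items():
        if down is not None:
            pending[down] += 1

    accumulated = dict(local)
    # Start from headwaters (no upstream neighbours) and walk downstream.
    queue = deque(sid for sid, n in pending.items() if n == 0)
    while queue:
        sid = queue.popleft()
        down = downstream.get(sid)
        if down is None:
            continue
        accumulated[down] += accumulated[sid]
        pending[down] -= 1
        # A stretch is complete once all of its upstream neighbours are added.
        if pending[down] == 0:
            queue.append(down)
    return accumulated


# Toy network: stretches 1 and 2 join at 3, which drains to outlet 4.
downstream = {1: 3, 2: 3, 3: 4, 4: None}
local_area = {1: 10.0, 2: 5.0, 3: 2.0, 4: 1.0}
accumulated_area = accumulate_upstream(downstream, local_area)
print(accumulated_area)
```

Attributes that are means rather than totals (for example mean slope) would need an area-weighted version of the same traversal.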

Step 3: Model Processing

Description: Six ML models were run. For every model, K-fold cross-validation was used to predict at the gauging sites, and a model trained on all gauging data was used to predict at the ungauged sites.
File: src/process_modelig/model_run.py
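
The two prediction modes described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of model_run.py: `MeanModel` is a hypothetical placeholder for any of the six models, and `kfold_indices`, `run_model`, the fold count, and the toy data are all invented for the example.

```python
import random


class MeanModel:
    """Placeholder regressor that predicts the training mean (hypothetical)."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def run_model(make_model, X_gauged, y_gauged, X_ungauged, k=5, seed=0):
    # Out-of-fold predictions: each gauged site is predicted by a model
    # that did not see it during training.
    oof = [None] * len(X_gauged)
    for fold in kfold_indices(len(X_gauged), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X_gauged)) if i not in held_out]
        model = make_model().fit([X_gauged[i] for i in train],
                                 [y_gauged[i] for i in train])
        preds = model.predict([X_gauged[i] for i in fold])
        for i, p in zip(fold, preds):
            oof[i] = p

    # Ungauged sites: refit on every gauged site, then predict.
    full_model = make_model().fit(X_gauged, y_gauged)
    return oof, full_model.predict(X_ungauged)


# Toy data: 10 gauged sites with one feature each, 2 ungauged sites.
X_gauged = [[float(i)] for i in range(10)]
y_gauged = [2.0 * i for i in range(10)]
X_ungauged = [[20.0], [30.0]]
oof_pred, ungauged_pred = run_model(MeanModel, X_gauged, y_gauged, X_ungauged, k=5)
```

The out-of-fold predictions give an honest performance estimate at gauged sites, while the refit on all gauged data makes full use of the observations when predicting where no gauge exists.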

Step 4: Post-processing

Description: The trained models were evaluated, and performance metrics were saved.
1. Evaluation of averaged ensemble combination: src/process_post/ens_eval.py
2. Processing of the best ensemble combination to all data: src/process_post/ens_run.py
3. Uncertainty estimation: src/process_post/unc_run.py
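
One way to read "averaged ensemble combination" in item 1 is an exhaustive search over subsets of models, scoring the unweighted mean prediction of each subset. The sketch below assumes exactly that and assumes RMSE as the score; the metric actually used in ens_eval.py is not stated here, and `best_averaged_ensemble`, the model names, and the numbers are hypothetical.

```python
from itertools import combinations
from math import sqrt


def rmse(obs, pred):
    """Root mean squared error (assumed metric, chosen for illustration)."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def best_averaged_ensemble(obs, model_preds):
    """Score every non-empty subset of models by the RMSE of its mean prediction.

    model_preds: dict mapping model name -> list of predictions at gauged sites.
    Returns (best_subset, best_score).
    """
    names = sorted(model_preds)
    best = None
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            # Unweighted average of the member models, site by site.
            avg = [sum(vals) / size
                   for vals in zip(*(model_preds[m] for m in subset))]
            score = rmse(obs, avg)
            if best is None or score < best[1]:
                best = (subset, score)
    return best


# Toy observations and predictions at three gauged sites.
obs = [1.0, 2.0, 3.0]
model_preds = {
    "model_a": [0.0, 1.0, 2.0],  # biased low by 1
    "model_b": [2.0, 3.0, 4.0],  # biased high by 1
    "model_c": [1.0, 2.0, 5.0],  # one large error
}
subset, score = best_averaged_ensemble(obs, model_preds)
print(subset, score)
```

With six models there are 63 non-empty subsets, so an exhaustive search of this kind stays cheap.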

How to Use

Clone the Repository:

    git clone https://github.com/barbedorafael/ml_pipeline.git
    cd ml_pipeline

Install Dependencies: Install the required libraries with:

    pip install -r requirements.txt

Requirements

Python 3.10+
Google Earth Engine Python API
Additional Python libraries (see requirements.txt)

Contributing

Suggestions, bug reports, and contributions are welcome! Open an issue or submit a pull request to improve the workflow.

License

This project is licensed under the MIT License. See the LICENSE file for details.
