This repository demonstrates a complete end-to-end MLOps pipeline for building a weather forecasting model. By leveraging tools like DVC, Airflow, and MLFlow, we’ve created an automated, scalable, and reproducible workflow for collecting data, training models, and monitoring performance.
- Project Overview
- Features
- Architecture
- Getting Started
- Usage
- MLFlow Integration
- Key Tools
- Results
- Contributing
- License
The primary goal of this project is to:
- Collect live weather data 🌐.
- Preprocess and clean the data 🧹.
- Train machine learning models to predict weather conditions 🤖.
- Automate workflows for scalability and reproducibility ⚙️.
- Track models and experiments for easy version control 📊.
This project adopts MLOps best practices to ensure a seamless integration of machine learning into production workflows.
- Automated Data Collection: Fetch real-time weather data using APIs.
- Data Versioning: Track datasets with DVC to ensure reproducibility.
- Workflow Automation: Manage and automate tasks with Airflow DAGs.
- Model Versioning and Experiment Tracking: Use MLFlow for logging experiments and comparing model performance.
- Scalability: Modular design for easy extension and deployment.
Here’s an overview of the pipeline architecture:
graph TD;
A[Fetch Weather Data 🌦️] --> B[Preprocess Data 🧹]
B --> C[Train ML Model 🤖]
C --> D[Evaluate and Monitor Results 📊]
Each stage is modular and automated using Airflow, with DVC and MLFlow ensuring version control and tracking.
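As a concrete illustration, here is a minimal sketch of how those four stages could be wired together as an Airflow DAG. The DAG id `weather_pipeline` and the task callables below are illustrative placeholders, not the repository's actual module names.

```python
# dags/weather_pipeline.py -- illustrative sketch only; the real DAG may differ.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


# Placeholder callables; in practice these would call the project's own code.
def fetch_weather_data():
    print("Fetching live weather data...")


def preprocess_data():
    print("Cleaning and preprocessing data...")


def train_model():
    print("Training the forecasting model...")


def evaluate_model():
    print("Evaluating and logging results...")


with DAG(
    dag_id="weather_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    fetch = PythonOperator(task_id="fetch_weather_data", python_callable=fetch_weather_data)
    preprocess = PythonOperator(task_id="preprocess_data", python_callable=preprocess_data)
    train = PythonOperator(task_id="train_model", python_callable=train_model)
    evaluate = PythonOperator(task_id="evaluate_model", python_callable=evaluate_model)

    # Mirror the diagram: fetch -> preprocess -> train -> evaluate.
    fetch >> preprocess >> train >> evaluate
```

Keeping each callable a thin wrapper around the project's own fetch/preprocess/train/evaluate code keeps the DAG file small and the stages independently testable.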
- Python 3.8+
- Git
- DVC
- Apache Airflow
- MLFlow
- Clone the repository:
git clone https://github.com/your-repo/weather-mlops.git
cd weather-mlops
- Install dependencies:
pip install -r requirements.txt
- Initialize DVC (see the data-tracking example after these steps):
dvc init
- Configure Airflow:
airflow db init
airflow users create --username admin --password admin --firstname Admin --lastname User --role Admin --email admin@example.com
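Once DVC is initialized, datasets can be put under version control alongside the Git history. A minimal sketch (the file path and remote location are assumptions, not the repository's actual layout):

```bash
# Track a dataset with DVC; Git stores only the small .dvc pointer file.
dvc add data/raw/weather.csv
git add data/raw/weather.csv.dvc .gitignore
git commit -m "Track raw weather data with DVC"

# Optional: configure a default remote and push the data itself there.
dvc remote add -d storage /path/to/dvc-storage
dvc push
```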
Use the provided Python script to fetch live weather data:
python fetch_data.py --api_key YOUR_API_KEY --city "New York"
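For readers curious what such a script does internally, the sketch below is one plausible shape for it. It is not the repository's actual `fetch_data.py`: the API endpoint (an OpenWeatherMap-style URL), the response fields, and the output path are all assumptions.

```python
# Illustrative sketch of a weather fetcher; the real fetch_data.py may differ.
import argparse
import csv
import os
from datetime import datetime, timezone

import requests

API_URL = "https://api.openweathermap.org/data/2.5/weather"  # assumed endpoint


def fetch(api_key: str, city: str) -> dict:
    """Call the weather API and return the parsed JSON payload."""
    response = requests.get(
        API_URL,
        params={"q": city, "appid": api_key, "units": "metric"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()


def main() -> None:
    parser = argparse.ArgumentParser(description="Fetch live weather data.")
    parser.add_argument("--api_key", required=True)
    parser.add_argument("--city", required=True)
    args = parser.parse_args()

    payload = fetch(args.api_key, args.city)
    row = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "city": args.city,
        "temperature": payload["main"]["temp"],
        "humidity": payload["main"]["humidity"],
        "pressure": payload["main"]["pressure"],
        "wind_speed": payload["wind"]["speed"],
    }

    # Append the observation to a CSV that DVC can track.
    os.makedirs("data/raw", exist_ok=True)
    with open("data/raw/weather.csv", "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=row.keys())
        if f.tell() == 0:
            writer.writeheader()
        writer.writerow(row)


if __name__ == "__main__":
    main()
```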
Start the Airflow scheduler and webserver:
airflow scheduler
airflow webserver
Access the Airflow UI at http://localhost:8080, enable the DAG, and watch the pipeline run! 🎡
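The DAG can also be unpaused and triggered from the command line instead of the UI. The DAG id `weather_pipeline` below is an assumed name; use whatever id `airflow dags list` reports for your checkout.

```bash
# Assumes the DAG file sits on Airflow's DAGs path (e.g. $AIRFLOW_HOME/dags/).
airflow dags list
airflow dags unpause weather_pipeline
airflow dags trigger weather_pipeline
```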
Train models while logging parameters and metrics to MLFlow:
python train_model.py
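As a rough picture of what such a script can look like when scikit-learn and MLFlow are combined, here is a hedged sketch. The dataset path, feature columns, and model choice are assumptions, not the repository's actual `train_model.py`:

```python
# Illustrative training sketch; the real train_model.py may differ.
import mlflow
import mlflow.sklearn
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

mlflow.set_experiment("Weather Forecasting")

# Assumed preprocessed dataset and column names.
data = pd.read_csv("data/processed/weather.csv")
X = data[["humidity", "pressure", "wind_speed"]]
y = data["temperature"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

with mlflow.start_run():
    params = {"n_estimators": 200, "max_depth": 8}
    model = RandomForestRegressor(**params, random_state=42)
    model.fit(X_train, y_train)

    rmse = mean_squared_error(y_test, model.predict(X_test)) ** 0.5

    mlflow.log_params(params)
    mlflow.log_metric("rmse", rmse)
    mlflow.sklearn.log_model(model, "model")  # store the model as a run artifact
```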
Launch the MLFlow UI to view experiment results:
mlflow ui
Access it at http://localhost:5000.
MLFlow is integrated into the pipeline for:
- Logging model parameters (e.g., learning rate, batch size).
- Tracking metrics (e.g., accuracy, RMSE).
- Managing model versions and artifacts.
- Comparing experiment results.
Here’s how the pipeline logs experiments to MLFlow:
- Initialization: MLFlow is initialized with a remote tracking URI or a local directory.
import mlflow
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("Weather Forecasting")
- Logging Parameters and Metrics: During training, parameters (e.g., hyperparameters) and metrics (e.g., validation accuracy) are logged:
with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.001)
    mlflow.log_metric("rmse", 2.3)
- Storing Artifacts: Model artifacts (e.g., trained model files) are saved and versioned:
mlflow.log_artifact("models/weather_model.pkl")
- Model Registry: The best-performing model is registered so it can be promoted to production (a short usage sketch follows this list):
mlflow.register_model("runs:/<run_id>/model", "WeatherModel")
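Once a version is registered, it can be promoted and then loaded by stage. A minimal sketch, assuming the registered model is named `WeatherModel`; the version number is illustrative:

```python
import mlflow.pyfunc
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Promote version 1 of the registered model to the Production stage.
client.transition_model_version_stage(name="WeatherModel", version="1", stage="Production")

# Downstream code can then load whatever is currently in Production.
model = mlflow.pyfunc.load_model("models:/WeatherModel/Production")
```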
The MLFlow UI provides a comprehensive interface to:
- Compare experiments.
- Visualize metrics.
- Manage model versions.
To launch the UI:
mlflow ui
Access it at http://localhost:5000.
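Runs can also be compared programmatically rather than in the browser. A small sketch, assuming the `Weather Forecasting` experiment name used above (`mlflow.search_runs` with `experiment_names` requires a reasonably recent MLFlow release):

```python
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")

# One row per run, with params.* and metrics.* columns, best RMSE first.
runs = mlflow.search_runs(
    experiment_names=["Weather Forecasting"],
    order_by=["metrics.rmse ASC"],
)
print(runs[["run_id", "params.learning_rate", "metrics.rmse"]].head())
```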
- DVC: Tracks datasets and models, ensuring reproducibility.
- Airflow: Automates pipeline tasks.
- MLFlow: Logs and tracks experiments, parameters, and artifacts.
- Weather API: Provides live weather data.
- Scikit-learn: Used for model training.
The pipeline outputs include:
- Cleaned Datasets: Version-controlled and stored using DVC.
- Trained Models: Versioned with DVC for reproducibility.
- Experiment Logs: Detailed metrics, parameters, and artifacts tracked using MLFlow.
- Automated Pipeline: Tasks run seamlessly using Airflow.
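Because datasets and models are tracked with DVC, an earlier state of the pipeline can be restored from Git history. A minimal sketch (the tag name is illustrative):

```bash
git checkout v1.0   # illustrative tag or commit
dvc checkout        # restore the matching data and model versions from the DVC cache
dvc pull            # or fetch them from remote storage if they are not cached locally
```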
Contributions are welcome! Please follow these steps:
- Fork the repository.
- Create a new branch:
git checkout -b feature-branch
- Commit your changes:
git commit -m "Add feature"
- Push to the branch:
git push origin feature-branch
- Submit a pull request.
This project is licensed under the MIT License. Feel free to use and modify it as you like. 🎉
If you found this project helpful, feel free to:
- 🌟 Star the repo!
Happy coding! 🚀