This project implements an end-to-end MLOps pipeline for water potability detection. The system uses machine learning to predict, from measured water quality parameters, whether a water sample is safe to drink.
- ML Pipeline: Data preprocessing, model training, and evaluation
- API Server: FastAPI-based REST API for model serving
- Web Interface: User-friendly web interface for testing the model
- Monitoring: Real-time model monitoring for drift detection
- MLOps: Continuous integration, deployment, and monitoring
- Docker: Containerized deployment for portability
- A/B Testing: Compare performance of different models in production
- Persistent Monitoring: Robust monitoring with persistent storage of metrics
```
.
├── data/                 # Training, validation, and test data (DVC tracked)
├── notebooks/            # Jupyter notebooks for data exploration and model development
├── models/               # Trained ML models (DVC tracked)
├── results/              # Evaluation results and best model artifacts
├── src/                  # Source code for scripts and pipelines
│   ├── scripts/          # Python scripts for training, evaluation, and deployment
│   └── tests/            # Unit and integration tests
├── web/                  # Frontend web application
│   ├── static/           # Static assets (CSS, JS, images)
│   └── templates/        # HTML templates
├── monitoring/           # Model monitoring configurations and dashboards
├── .github/              # GitHub Actions workflows for CI/CD
├── setup.sh              # Setup script for the project
├── Dockerfile            # Docker configuration for containerized deployment
├── docker-compose.yml    # Docker Compose configuration
└── README.md             # Project documentation
```
- Python 3.11+
- Docker and Docker Compose (for containerized deployment)
- Clone the repository and navigate to the project directory:

```bash
git clone https://github.com/yourusername/water-potability-detection.git
cd water-potability-detection
```

- Run the setup script:

```bash
chmod +x setup.sh
./setup.sh
```

- Start the application:

```bash
# Option 1: Run with Python
uvicorn src.scripts.deploy_api:app --reload

# Option 2: Run with Docker
docker-compose up --build
```
If you encounter Docker issues, try the troubleshooting script:
```bash
# Make the script executable
chmod +x docker_troubleshoot.sh

# Run the script
./docker_troubleshoot.sh
```
Common Docker issues and solutions:
- Connection refused error: the Docker daemon is not running. Start it with `sudo systemctl start docker`.
- Permission denied: add your user to the docker group with `sudo usermod -aG docker $USER`, then log out and back in.
- Missing dependencies: install both Docker and Docker Compose.
- Access the web interface at http://localhost:8000
- Data Version Control: Data and models are tracked with DVC
- Training Pipeline: `src/scripts/train_pipeline.py` trains multiple models
- Evaluation: `src/scripts/evaluate_pipeline.py` evaluates them and selects the best model
- Deployment: A FastAPI server exposes the best model via a REST API
- Monitoring: Model performance and data drift are continuously monitored
- CI/CD: GitHub Actions workflows handle testing, training, and deployment
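Since data and models are DVC-tracked, a typical way to reproduce the pipeline is sketched below. This assumes the repository defines a `dvc.yaml` with training and evaluation stages and a configured remote; adjust to your setup:

```bash
# Fetch DVC-tracked data and models from the configured remote
dvc pull

# Reproduce the pipeline (re-runs only stages whose inputs changed)
dvc repro

# Or invoke the stage scripts directly (paths per the project layout above)
python src/scripts/train_pipeline.py
python src/scripts/evaluate_pipeline.py
```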
API endpoints:

- `GET /`: Web interface
- `POST /api/predict`: Predict water potability
- `GET /api/health`: API health check
- `GET /api/metrics`: Model performance metrics
- `GET /metrics`: Prometheus metrics endpoint
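For example, a prediction request can be sent with curl. The field names below are an assumption based on the common water potability dataset schema; check the model's expected input before relying on them:

```bash
# Hypothetical payload; field names assume the standard water potability dataset
curl -X POST "http://localhost:8000/api/predict" \
  -H "Content-Type: application/json" \
  -d '{
        "ph": 7.0,
        "Hardness": 200.0,
        "Solids": 20000.0,
        "Chloramines": 7.0,
        "Sulfate": 350.0,
        "Conductivity": 400.0,
        "Organic_carbon": 14.0,
        "Trihalomethanes": 70.0,
        "Turbidity": 4.0
      }'
```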
Access the monitoring dashboards:
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000 (default login: admin/admin)
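If you need to wire up the monitoring stack yourself, a minimal Docker Compose sketch for Prometheus and Grafana on the ports above could look like the following. The service names and the `prometheus.yml` path are assumptions; the project's own `docker-compose.yml` may already define these services:

```yaml
# Minimal sketch; assumes a Prometheus scrape config exists at monitoring/prometheus.yml
services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./monitoring/prometheus.yml:/etc/prometheus/prometheus.yml
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
```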
This project includes comprehensive model monitoring capabilities:
- Data Drift Detection: Automatically detect when the distribution of input data changes
- Performance Metrics Tracking: Track model accuracy, latency, and other performance metrics over time
- Visualization: Generate visualizations for monitoring reports
- Persistent Storage: All monitoring data is stored persistently for historical analysis
- Alerting: Configurable thresholds for alerting when metrics cross specified boundaries
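As an illustration of the drift-detection idea, here is a minimal sketch using a two-sample Kolmogorov–Smirnov test. This is not the project's implementation in `src/scripts/model_monitoring.py`, just the underlying technique; the significance threshold is an assumption:

```python
# Minimal data-drift sketch: compare a live feature sample against the
# training reference distribution with a two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

def detect_drift(reference: np.ndarray, live: np.ndarray, alpha: float = 0.05) -> bool:
    """Return True if the live sample's distribution differs from the reference."""
    statistic, p_value = ks_2samp(reference, live)
    return p_value < alpha  # low p-value => distributions likely differ

# Example: reference pH values from training vs. a shifted live batch
rng = np.random.default_rng(42)
reference_ph = rng.normal(7.0, 1.5, size=5000)
live_ph = rng.normal(8.2, 1.5, size=500)
print(detect_drift(reference_ph, live_ph))  # True: the live batch has drifted
```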
To run the monitoring system manually:

```bash
python src/scripts/model_monitoring.py
```

Monitoring reports are stored in the `monitoring/` directory, and metrics history is saved in `logs/`.
The project supports A/B testing to compare different models in production:
- Traffic Splitting: Configure the percentage of traffic to route to each model variant
- Performance Comparison: Compare metrics between model variants
- API Controls: Enable/disable A/B testing and adjust traffic split via API endpoints
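To illustrate how traffic splitting works conceptually, here is a minimal sketch. This is not the server's actual routing code; the variant names mirror the configuration example below:

```python
# Minimal traffic-splitting sketch: route each request to a model variant
# according to the configured split, e.g. {"A": 0.8, "B": 0.2}.
import random

def pick_variant(traffic_split: dict[str, float]) -> str:
    """Choose a variant with probability proportional to its traffic share."""
    variants = list(traffic_split)
    weights = [traffic_split[v] for v in variants]
    return random.choices(variants, weights=weights, k=1)[0]

counts = {"A": 0, "B": 0}
for _ in range(10_000):
    counts[pick_variant({"A": 0.8, "B": 0.2})] += 1
print(counts)  # roughly {'A': 8000, 'B': 2000}
```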
API endpoints for A/B testing:
- `GET /api/ab-testing`: Get current A/B testing statistics and configuration
- `POST /api/ab-testing/configure`: Configure A/B testing settings
Example of configuring A/B testing:
```bash
# Enable A/B testing with an 80/20 traffic split
curl -X POST "http://localhost:8000/api/ab-testing/configure" \
  -H "Content-Type: application/json" \
  -d '{"enabled": true, "traffic_split": {"A": 0.8, "B": 0.2}}'
```
The project uses Pytest for testing. To run tests:
```bash
pytest src/tests/
```
Tests include:
- Unit tests for individual functions.
- Integration tests for pipelines and APIs.
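As a sketch of what an API-level test might look like, the example below exercises the documented health-check endpoint with FastAPI's `TestClient`. It assumes the app object is importable from `src.scripts.deploy_api`, as implied by the `uvicorn` run command above:

```python
# Sketch of an integration test for the health endpoint using FastAPI's TestClient.
from fastapi.testclient import TestClient

from src.scripts.deploy_api import app  # app path per the run command above

client = TestClient(app)

def test_health_endpoint():
    response = client.get("/api/health")
    assert response.status_code == 200
```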
This project is licensed under the MIT License - see the LICENSE file for details.