A comprehensive MLOps learning repository covering Python fundamentals to advanced machine learning operations, experiment tracking, and deployment strategies.
- Overview
- Prerequisites
- Learning Path
- Quick Start
- Module Breakdown
- Tools & Technologies
- Projects
- Additional Tech Stack
- Resources
This repository serves as a complete learning resource for Machine Learning Operations (MLOps), starting from Python fundamentals and progressing to advanced MLOps practices. Whether you're a beginner or looking to enhance your MLOps skills, this structured curriculum will guide you through:
- Python Programming Fundamentals
- Data Analysis & Manipulation
- Machine Learning Model Development
- Experiment Tracking with MLflow
- Cloud-based MLflow on AWS Infrastructure
- AWS SageMaker for Enterprise ML
- Data Version Control with DVC
- Model Deployment & Monitoring
- Web Application Development with Flask
- Basic understanding of programming concepts
- Python 3.8 or higher installed
- Git for version control
- Jupyter Notebook/Lab environment
- Basic knowledge of machine learning concepts (helpful but not required)
graph TD
A[Python Basics] --> B[Control Flow]
B --> C[Data Structures]
C --> D[Functions & Modules]
D --> E[OOP Concepts]
E --> F[Exception Handling]
F --> G[File Operations]
G --> H[Advanced Python]
H --> I[Data Analysis]
I --> J[Database Operations]
J --> K[Logging]
K --> L[Flask Web Development]
L --> M[MLflow & Experiment Tracking]
M --> N[MLflow on AWS Cloud]
N --> O[DVC - Data Version Control]
O --> P[DagsHub Integration]
P --> Q[AWS EC2 + S3 Deployment]
Q --> R[Docker Containerization]
R --> S[Apache Airflow Workflows]
S --> T[GitHub Actions CI/CD]
T --> U[AWS Elastic Beanstalk + Azure Deployment]
U --> V[Hugging Face Integration & Fine-Tuning]
V --> W[AWS SageMaker Pipeline]
W --> X[Grafana for Monitoring]
X --> Y[Amazon Bedrock - GenAI]
- Clone the repository:

      git clone https://github.com/jagadeshchilla/MLOPS.git
      cd MLOPS

- Set up a virtual environment, then activate it with the command that matches your platform:

      python -m venv venv
      source venv/bin/activate    # Linux/macOS
      venv\Scripts\activate       # Windows

- Install dependencies:

      pip install -r mlflow/requirements.txt

- Start Jupyter Lab:

      jupyter lab

- Launch the MLflow UI:

      mlflow ui
Module | Topic | Description | Key Concepts |
---|---|---|---|
1 | Python Basics | Variables, data types, operators | Foundation concepts |
2 | Control Flow | Conditional statements, loops | Decision making & iteration |
3 | Data Structures | Lists, tuples, sets, dictionaries | Data organization |
4 | Functions | Function definition, lambda, map/filter | Code reusability |
5 | Modules | Import systems, packages | Code organization |
6 | File Handling | File I/O operations | Data persistence |
7 | Exception Handling | Error handling, custom exceptions | Robust programming |
8 | OOP | Classes, inheritance, polymorphism | Object-oriented design |
9 | Advanced Python | Iterators, generators, decorators | Advanced techniques |
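As a small illustration of the "Advanced Python" row above, the sketch below combines a generator with a decorator. The names `timed`, `batches`, and `total_of_batches` are hypothetical and chosen for this example only; they are not taken from the repository's notebooks.

```python
import functools
import time


def timed(func):
    """Decorator that records how long the most recent call took."""
    @functools.wraps(func)  # preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_elapsed = time.perf_counter() - start
        return result

    wrapper.last_elapsed = None  # no call has been timed yet
    return wrapper


def batches(items, size):
    """Generator that lazily yields consecutive chunks of `items`."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


@timed
def total_of_batches(items, size):
    # Sum each batch as it is produced, then sum the partial results.
    return sum(sum(batch) for batch in batches(items, size))


if __name__ == "__main__":
    print(list(batches([1, 2, 3, 4, 5], 2)))
    print(total_of_batches(list(range(10)), 3))
    print(f"took {total_of_batches.last_elapsed:.6f}s")
```

The generator produces one chunk at a time instead of building every chunk up front, which is the same pattern used when streaming large datasets in batches.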
- NumPy: Numerical computing and array operations
- Pandas: Data manipulation and analysis
- Matplotlib: Data visualization and plotting
- Seaborn: Statistical data visualization
- Data Processing: Reading from CSV, Excel, and various formats
- SQLite3: Database creation and management
- CRUD Operations: Create, Read, Update, Delete
- Data Integration: Connecting Python with databases
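The CRUD operations listed above can be sketched with the standard-library `sqlite3` module. This is a minimal example under stated assumptions: the `employees` table, its columns, and the sample rows are invented for illustration and do not come from the repository.

```python
import sqlite3


def run_crud_demo():
    """Run Create, Read, Update, Delete against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")  # nothing is written to disk
    try:
        conn.execute(
            "CREATE TABLE employees ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)"
        )
        # Create: parameterized inserts keep values out of the SQL string.
        conn.executemany(
            "INSERT INTO employees (name, age) VALUES (?, ?)",
            [("Asha", 31), ("Ben", 27)],
        )
        # Read
        before = conn.execute(
            "SELECT name, age FROM employees ORDER BY id"
        ).fetchall()
        # Update
        conn.execute("UPDATE employees SET age = ? WHERE name = ?", (28, "Ben"))
        # Delete
        conn.execute("DELETE FROM employees WHERE name = ?", ("Asha",))
        conn.commit()
        after = conn.execute(
            "SELECT name, age FROM employees ORDER BY id"
        ).fetchall()
        return before, after
    finally:
        conn.close()


if __name__ == "__main__":
    before, after = run_crud_demo()
    print("before:", before)
    print("after: ", after)
```

Using `:memory:` keeps the example self-contained; pointing `sqlite3.connect` at a file path instead would persist the data between runs.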
- Python Logging: Structured logging practices
- Multiple Loggers: Advanced logging configurations
- Log Management: Best practices for production systems
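One way to read "Multiple Loggers" is a separate named logger per pipeline stage, each with its own level and handler. The sketch below assumes that reading; the logger names `pipeline.training` and `pipeline.serving` and the helper `build_logger` are hypothetical. It writes to in-memory streams so the output is easy to inspect, whereas a production setup would more likely use file or stream handlers.

```python
import io
import logging


def build_logger(name, level, stream):
    """Return a named logger with its own handler, level, and format."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False   # keep records out of the root logger
    logger.handlers.clear()    # avoid duplicate handlers when re-run in a notebook
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(name)s | %(levelname)s | %(message)s")
    )
    logger.addHandler(handler)
    return logger


training_stream = io.StringIO()
serving_stream = io.StringIO()
training_log = build_logger("pipeline.training", logging.DEBUG, training_stream)
serving_log = build_logger("pipeline.serving", logging.WARNING, serving_stream)

training_log.debug("loaded %d rows", 100)
serving_log.info("request received")   # below WARNING, so it is dropped
serving_log.warning("slow response")

if __name__ == "__main__":
    print(training_stream.getvalue(), end="")
    print(serving_stream.getvalue(), end="")
```

Giving each component its own level lets a noisy stage log at `DEBUG` while a latency-sensitive one records only warnings and errors.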
- Flask Framework: Web application development
- API Development: RESTful API creation
- Template Rendering: Dynamic web pages
- Static Files: CSS, JavaScript integration
- Container Orchestration: Docker-based application deployment
- Alpine Linux: Lightweight, secure container base images
- Flask Containerization: Production-ready web application containers
- DevOps Integration: CI/CD pipeline integration with Docker
- Microservices Architecture: Scalable, containerized service deployment
- DAG Development: Directed Acyclic Graph workflow creation
- Task Scheduling: Automated task execution and dependency management
- MLOps Pipeline Orchestration: End-to-end ML workflow automation
- Astro CLI Integration: Modern development and deployment tooling
- Monitoring & Observability: Comprehensive workflow monitoring and alerting
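The idea behind a DAG-based scheduler can be shown without Airflow itself. The sketch below uses the standard-library `graphlib` module (Python 3.9 or newer, so slightly above the 3.8 minimum listed in the prerequisites) to run tasks only after their upstream dependencies. It is not Airflow's API; the task names and the `run_pipeline` helper are assumptions made for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "train": {"transform"},
    "evaluate": {"train"},
    "report": {"evaluate", "transform"},
}


def run_pipeline(deps, tasks):
    """Run every task after all of its upstream tasks; return the order used."""
    # static_order() raises graphlib.CycleError if the graph is not acyclic.
    order = list(TopologicalSorter(deps).static_order())
    for name in order:
        tasks[name]()
    return order


if __name__ == "__main__":
    # Each stand-in task just announces itself.
    demo_tasks = {
        name: (lambda name=name: print(f"running {name}"))
        for name in dependencies
    }
    run_pipeline(dependencies, demo_tasks)
```

An orchestrator such as Airflow adds scheduling, retries, and monitoring on top of this ordering step, but the dependency resolution follows the same principle.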
- Continuous Integration: Automated code validation and testing
- Continuous Deployment: Streamlined deployment workflows
- Automated Testing: Comprehensive test suite with pytest
- Code Quality Gates: Flake8 linting and coverage reporting
- Multi-Python Support: Matrix testing across Python 3.8, 3.9, 3.10
- Security Scanning: Dependency auditing and vulnerability detection
- Experiment Tracking: Model versioning and metrics logging
- Model Registry: Centralized model management
- Deployment: Model serving and monitoring
- Hyperparameter Tuning: Automated optimization workflows
- DVC Init & Setup: Track large data files and models
- Data Pipelines: Automate preprocessing and training steps
- Versioning: Manage dataset/model history like Git
- Remote Storage: Push/pull from S3, GDrive, or DagsHub
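Data versioning tools such as DVC keep large files out of Git and track them through a hash of their content. The sketch below illustrates only that idea, using SHA-256 from `hashlib` as a stand-in; it is not DVC's implementation, and the helper names and the `data.csv` file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def content_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path, recorded_hash):
    """True when the file's current content no longer matches the recorded hash."""
    return content_hash(path) != recorded_hash


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "data.csv"  # hypothetical dataset file
        data.write_text("id,label\n1,0\n")
        recorded = content_hash(data)  # what a small pointer file would store
        print("changed after no edit:", has_changed(data, recorded))
        data.write_text("id,label\n1,1\n")
        print("changed after edit:   ", has_changed(data, recorded))
```

Only the short hash needs to live in Git; the data itself can sit in remote storage and be fetched by hash when a given version is checked out.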
- End-to-End MLOps Platform: Combines Git, DVC, MLflow
- Experiment Logging: Track experiments on DagsHub via MLflow
- Collaboration: Share datasets, models, and runs
- Visualization: View lineage, metrics, and model artifacts
- AWS EC2 + S3: Host MLflow tracking server, manage artifacts
- Elastic Beanstalk: Deploy production-ready ML web apps
- Azure App Services: Cross-cloud deployment of Flask/API apps
- CodePipeline & IAM: CI/CD automation and role-based security
- Real-time Dashboarding: Monitor training logs, system metrics
- Custom Alerts: Track data drift, performance, and failures
- CloudWatch + Grafana: Unified AWS observability
- ML Model Monitoring: Visualize accuracy, latency, loss over time
Technologies I have learned so far
Amazon Web Services
Elastic Compute Cloud
Simple Storage Service
Relational Database Service
Elastic Container Registry
Elastic Beanstalk
Machine Learning Platform
Identity & Access Management
Amazon Bedrock
Microsoft Azure
I completed several projects while learning this tech stack, and it was a wonderful journey!
Each project helped reinforce practical MLOps, data science, DevOps, and cloud deployment skills.
Check out the table below to explore my work:
Project Name | Description | Link |
---|---|---|
House Price Prediction with MLflow | ML model training and experiment tracking with MLflow | Repo |
ANN with MLflow | Neural network model tracked and visualized with MLflow | Repo |
DVC & MLflow Pipeline | ML pipeline with DVC and MLflow for versioning and tracking | Repo |
NASA APOD ETL Pipeline | ETL pipeline using Apache Airflow to pull data from NASA API | Repo |
GitHub Actions Test | Simple CI/CD pipeline using GitHub Actions | Repo |
Dockerized Flask App | Flask app containerized using Docker | Repo |
Wine Quality Prediction | ML model for predicting wine quality | Repo |
Phishing Detection API | ML API to detect phishing websites | Repo |
Student Performance (EC2) | ML deployment on AWS EC2 using Docker + CI/CD | Repo |
Student Performance (Beanstalk) | Flask ML app deployed via AWS Elastic Beanstalk | Repo |
Student Performance (Azure) | Flask app deployed to Azure App Service | Repo |
Text Summarizer API | Text summarization using Transformers and Hugging Face | Repo |
Mobile Classification (SageMaker) | Mobile price classification using AWS SageMaker | Repo |
I have learned this tech stack as part of my MLOps journey, spanning model development, experiment tracking, containerization, orchestration, automation, and deployment.
Here are some powerful tools I aim to explore and implement in upcoming projects for a complete MLOps lifecycle:
These tools will help me advance into scalable and enterprise-ready MLOps workflows using Kubernetes, event-driven pipelines, and real-time monitoring.
Here are some of the learning materials and official documentation that helped me build strong foundations in MLOps and Cloud-based ML systems:
- Python Docs
- MLflow Documentation
- DVC Docs
- DagsHub Docs
- Docker Docs
- Apache Airflow Docs
- GitHub Actions Docs
- AWS EC2 Docs
- AWS S3 Docs
- AWS SageMaker Docs
- AWS Bedrock Docs
- AWS RDS Docs
- AWS Elastic Beanstalk Docs
- Azure Docs
- Grafana Docs
- FastAPI Docs
- Hugging Face Transformers Docs
- TensorFlow Docs
- Keras Docs
- Jupyter Notebook Docs