🚀 MLOps Learning Journey

Python MLflow Jupyter Flask Pandas NumPy Scikit-learn DVC

Docker Apache Airflow Dagshub MySQL

AWS EC2 S3 RDS ECR Elastic Beanstalk CodePipeline Bedrock SageMaker

Azure

GitHub Actions CI/CD

A comprehensive MLOps learning repository covering Python fundamentals to advanced machine learning operations, experiment tracking, and deployment strategies.

🎯 Overview

This repository serves as a complete learning resource for Machine Learning Operations (MLOps), starting from Python fundamentals and progressing to advanced MLOps practices. Whether you're a beginner or looking to enhance your MLOps skills, this structured curriculum will guide you through:

  • Python Programming Fundamentals
  • Data Analysis & Manipulation
  • Machine Learning Model Development
  • Experiment Tracking with MLflow
  • Cloud-based MLflow on AWS Infrastructure
  • AWS SageMaker for Enterprise ML
  • Data Version Control with DVC
  • Model Deployment & Monitoring
  • Web Application Development with Flask

🛠️ Prerequisites

  • Basic understanding of programming concepts
  • Python 3.8 or higher installed
  • Git for version control
  • Jupyter Notebook/Lab environment
  • Basic knowledge of machine learning concepts (helpful but not required)

📚 Learning Path

graph TD
    A[Python Basics] --> B[Control Flow]
    B --> C[Data Structures]
    C --> D[Functions & Modules]
    D --> E[OOP Concepts]
    E --> F[Exception Handling]
    F --> G[File Operations]
    G --> H[Advanced Python]
    H --> I[Data Analysis]
    I --> J[Database Operations]
    J --> K[Logging]
    K --> L[Flask Web Development]
    L --> M[MLflow & Experiment Tracking]
    M --> N[MLflow on AWS Cloud]
    N --> O[DVC - Data Version Control]
    O --> P[Dagshub Integration]
    P --> Q[AWS EC2 + S3 Deployment]
    Q --> R[Docker Containerization]
    R --> S[Apache Airflow Workflows]
    S --> T[GitHub Actions CI/CD]
    T --> U[AWS Beanstalk + Azure Deployment]
    U --> V[Hugging Face Integration & Fine-Tuning]
    V --> W[AWS SageMaker Pipeline]
    W --> X[Grafana for Monitoring]
    X --> Y[Amazon Bedrock - GenAI]

🚀 Quick Start

  1. Clone the repository:

    git clone https://github.com/jagadeshchilla/MLOPS.git
    cd MLOPS
  2. Set up virtual environment:

    python -m venv venv
    source venv/bin/activate    # macOS/Linux
    venv\Scripts\activate       # Windows (use this instead of the line above)
  3. Install dependencies:

    pip install -r mlflow/requirements.txt
  4. Start Jupyter Lab:

    jupyter lab
  5. Launch MLflow UI:

    mlflow ui

📖 Module Breakdown

🐍 Python Fundamentals (Modules 1-9)

| Module | Topic | Description | Key Concepts |
|--------|-------|-------------|--------------|
| 1 | Python Basics | Variables, data types, operators | Foundation concepts |
| 2 | Control Flow | Conditional statements, loops | Decision making & iteration |
| 3 | Data Structures | Lists, tuples, sets, dictionaries | Data organization |
| 4 | Functions | Function definition, lambda, map/filter | Code reusability |
| 5 | Modules | Import systems, packages | Code organization |
| 6 | File Handling | File I/O operations | Data persistence |
| 7 | Exception Handling | Error handling, custom exceptions | Robust programming |
| 8 | OOP | Classes, inheritance, polymorphism | Object-oriented design |
| 9 | Advanced Python | Iterators, generators, decorators | Advanced techniques |
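
As a taste of the Module 9 topics, here is a minimal sketch that combines a generator with a decorator. The names `timed`, `batches`, and `total_of_squares` are hypothetical and chosen only for illustration; they are not taken from the repository's notebooks.

```python
import functools
import time


def timed(func):
    """Decorator that records how long the most recent call to `func` took."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        wrapper.last_elapsed = time.perf_counter() - start
        return result

    wrapper.last_elapsed = None
    return wrapper


def batches(items, size):
    """Generator that lazily yields consecutive chunks of `items`."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


@timed
def total_of_squares(numbers, batch_size=3):
    # Consume the generator one batch at a time instead of building all chunks up front.
    return sum(x * x for batch in batches(numbers, batch_size) for x in batch)


result = total_of_squares(list(range(10)))
print(result)  # 285
```

`functools.wraps` keeps the decorated function's name and docstring intact, which matters once decorated functions show up in logs or tracebacks.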

📊 Data Science & Analysis (Module 10)

  • NumPy: Numerical computing and array operations
  • Pandas: Data manipulation and analysis
  • Matplotlib: Data visualization and plotting
  • Seaborn: Statistical data visualization
  • Data Processing: Reading from CSV, Excel, and various formats

🗄️ Database Operations (Module 11)

  • SQLite3: Database creation and management
  • CRUD Operations: Create, Read, Update, Delete
  • Data Integration: Connecting Python with databases
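
The four CRUD operations can be sketched with the standard-library `sqlite3` module. The `employees` table and its rows are made-up sample data, and the in-memory database is an assumption that keeps the example self-contained.

```python
import sqlite3

# An in-memory database disappears when the connection closes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, department TEXT)"
)

# Create: parameterized inserts avoid SQL injection.
conn.executemany(
    "INSERT INTO employees (name, department) VALUES (?, ?)",
    [("Asha", "Data"), ("Ravi", "Platform")],
)
conn.commit()

# Read
rows = conn.execute(
    "SELECT name, department FROM employees ORDER BY id"
).fetchall()

# Update
conn.execute(
    "UPDATE employees SET department = ? WHERE name = ?", ("MLOps", "Ravi")
)

# Delete
conn.execute("DELETE FROM employees WHERE name = ?", ("Asha",))
conn.commit()

remaining = conn.execute("SELECT name, department FROM employees").fetchall()
print(remaining)  # [('Ravi', 'MLOps')]
conn.close()
```

Passing a file path instead of `":memory:"` would persist the data between runs.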

📝 Logging & Monitoring (Module 12)

  • Python Logging: Structured logging practices
  • Multiple Loggers: Advanced logging configurations
  • Log Management: Best practices for production systems
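
A minimal sketch of a multiple-logger setup using the standard `logging` module. The logger names `training` and `serving` and the helper `build_logger` are hypothetical. The example writes to in-memory streams so the output is easy to inspect; a real setup would more likely use file or console handlers.

```python
import io
import logging

formatter = logging.Formatter("%(name)s - %(levelname)s - %(message)s")


def build_logger(name, level, stream):
    """Return a named logger with its own level and a single stream handler."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # keep records out of the root logger
    logger.handlers.clear()   # avoid duplicate handlers on repeated setup
    handler = logging.StreamHandler(stream)
    handler.setFormatter(formatter)
    logger.addHandler(handler)
    return logger


training_stream = io.StringIO()
serving_stream = io.StringIO()
training_log = build_logger("training", logging.DEBUG, training_stream)
serving_log = build_logger("serving", logging.WARNING, serving_stream)

training_log.debug("epoch 1 finished")
serving_log.info("request received")      # below WARNING, so it is dropped
serving_log.error("model file missing")

print(training_stream.getvalue(), end="")
print(serving_stream.getvalue(), end="")
```

Giving each component its own logger lets you raise or lower verbosity per component without touching the others.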

🌐 Web Development (Module 13)

  • Flask Framework: Web application development
  • API Development: RESTful API creation
  • Template Rendering: Dynamic web pages
  • Static Files: CSS, JavaScript integration

🐳 Docker Containerization

  • Container Orchestration: Docker-based application deployment
  • Alpine Linux: Lightweight, secure container base images
  • Flask Containerization: Production-ready web application containers
  • DevOps Integration: CI/CD pipeline integration with Docker
  • Microservices Architecture: Scalable, containerized service deployment

🌊 Apache Airflow & Workflow Orchestration

  • DAG Development: Directed Acyclic Graph workflow creation
  • Task Scheduling: Automated task execution and dependency management
  • MLOps Pipeline Orchestration: End-to-end ML workflow automation
  • Astro CLI Integration: Modern development and deployment tooling
  • Monitoring & Observability: Comprehensive workflow monitoring and alerting
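
To illustrate the idea behind a DAG of tasks, the sketch below resolves task dependencies into a valid execution order. It deliberately uses the standard-library `graphlib` module (Python 3.9+) rather than Airflow's API, and the task names and the `has_cycle` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Each task maps to the set of upstream tasks it depends on.
pipeline = {
    "extract": set(),
    "validate": {"extract"},
    "train": {"validate"},
    "evaluate": {"train"},
    "deploy": {"evaluate"},
}

# A topological order runs every task only after its dependencies.
order = list(TopologicalSorter(pipeline).static_order())
print(order)


def has_cycle(graph):
    """Return True if the dependency graph is not a valid DAG."""
    try:
        list(TopologicalSorter(graph).static_order())
    except CycleError:
        return True
    return False
```

A scheduler such as Airflow performs the same kind of dependency resolution, then adds scheduling, retries, and monitoring on top.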

🚀 GitHub Actions CI/CD

  • Continuous Integration: Automated code validation and testing
  • Continuous Deployment: Streamlined deployment workflows
  • Automated Testing: Comprehensive test suite with pytest
  • Code Quality Gates: Flake8 linting and coverage reporting
  • Multi-Python Support: Matrix testing across Python 3.8, 3.9, 3.10
  • Security Scanning: Dependency auditing and vulnerability detection

🔬 MLflow & Experiment Tracking

  • Experiment Tracking: Model versioning and metrics logging
  • Model Registry: Centralized model management
  • Deployment: Model serving and monitoring
  • Hyperparameter Tuning: Automated optimization workflows

📂 Data Version Control (DVC)

  • DVC Init & Setup: Track large data files and models
  • Data Pipelines: Automate preprocessing and training steps
  • Versioning: Manage dataset/model history like Git
  • Remote Storage: Push/pull from S3, GDrive, or DagsHub

🌐 DagsHub Integration

  • End-to-End MLOps Platform: Combines Git, DVC, MLflow
  • Experiment Logging: Track experiments on DagsHub via MLflow
  • Collaboration: Share datasets, models, and runs
  • Visualization: View lineage, metrics, and model artifacts

☁️ Cloud Deployments (AWS & Azure)

  • AWS EC2 + S3: Host MLflow tracking server, manage artifacts
  • Elastic Beanstalk: Deploy production-ready ML web apps
  • Azure App Services: Cross-cloud deployment of Flask/API apps
  • CodePipeline & IAM: CI/CD automation and role-based security

📊 Grafana Monitoring Integration

  • Real-time Dashboarding: Monitor training logs, system metrics
  • Custom Alerts: Track data drift, performance, and failures
  • CloudWatch + Grafana: Unified AWS observability
  • ML Model Monitoring: Visualize accuracy, latency, loss over time

🔧 Tools & Technologies

Technologies I have learned so far

Core Technologies

  • Python 3.8+
  • Jupyter Notebooks
  • Git (version control)

Data Science Stack

  • NumPy
  • Pandas
  • Matplotlib
  • Seaborn

Machine Learning & MLOps

  • Scikit-learn
  • MLflow
  • DVC
  • DagsHub
  • TensorFlow
  • Keras
  • Hugging Face

Web Development

  • Flask
  • HTML5
  • CSS3

Containerization & DevOps

  • Docker
  • Alpine Linux

Workflow Orchestration

  • Apache Airflow
  • Astro CLI

CI/CD & Automation

  • GitHub Actions
  • AWS CodePipeline
  • Pytest
  • Code quality: automated quality gates

Cloud Infrastructure

  • AWS (Amazon Web Services)
  • EC2 (Elastic Compute Cloud)
  • S3 (Simple Storage Service)
  • RDS (Relational Database Service)
  • ECR (Elastic Container Registry)
  • Elastic Beanstalk
  • SageMaker (machine learning platform)
  • IAM (Identity & Access Management)
  • Amazon Bedrock
  • Microsoft Azure

Databases & Monitoring

  • MySQL
  • Grafana

💡 Projects

I built several projects while learning this tech stack, and it was a wonderful journey!
Each project helped reinforce practical MLOps, data science, DevOps, and cloud deployment skills.
Check out the table below to explore my work:

| Project Name | Description | Tech Stack | Link |
|--------------|-------------|------------|------|
| House Price Prediction with MLflow | ML model training and experiment tracking with MLflow | Python, MLflow, scikit-learn, Jupyter | Repo |
| ANN with MLflow | Neural network model tracked and visualized with MLflow | Python, TensorFlow, MLflow, Keras, Hyperopt | Repo |
| DVC & MLflow Pipeline | ML pipeline with DVC and MLflow for versioning and tracking | Python, DVC, MLflow, scikit-learn, DagsHub | Repo |
| NASA APOD ETL Pipeline | ETL pipeline using Apache Airflow to pull data from NASA API | Apache Airflow, PostgreSQL, Docker, AWS RDS, Python | Repo |
| GitHub Actions Test | Simple CI/CD pipeline using GitHub Actions | GitHub Actions, Python, Pytest | Repo |
| Dockerized Flask App | Flask app containerized using Docker | Python, Docker, GitHub, GitHub Actions | Repo |
| Wine Quality Prediction | ML model for predicting wine quality | Python, MLflow, Flask, Docker, scikit-learn, DagsHub | Repo |
| Phishing Detection API | ML API to detect phishing websites | Python, FastAPI, Docker, AWS, MongoDB | Repo |
| Student Performance (EC2) | ML deployment on AWS EC2 using Docker + CI/CD | Python, AWS EC2, ECR, CI/CD | Repo |
| Student Performance (Beanstalk) | Flask ML app deployed via AWS Elastic Beanstalk | Python, Elastic Beanstalk, CI/CD, Docker | Repo |
| Student Performance (Azure) | Flask app deployed to Azure App Service | Python, Azure, Docker, CI/CD | Repo |
| Text Summarizer API | Text summarization using Transformers and Hugging Face | Python, FastAPI, Transformers, Hugging Face | Repo |
| Mobile Classification (SageMaker) | Mobile price classification using AWS SageMaker | Python, SageMaker, scikit-learn, Jupyter | Repo |

🧱 Additional Tech Stack

I have learned this tech stack as part of my MLOps journey, spanning model development, experiment tracking, containerization, orchestration, automation, and deployment.

✅ Learned MLOps Tools & Libraries

Python, MLflow, DVC, Docker, Apache Airflow, DagsHub, GitHub Actions, AWS EC2, AWS S3, Elastic Beanstalk, Azure, SageMaker, RDS, ECR, Bedrock, Grafana, FastAPI, Keras, TensorFlow, Hugging Face, Transformers, Jupyter


🚀 Other MLOps Tools I'm Planning to Explore

Here are some powerful tools I aim to explore and implement in upcoming projects for a complete MLOps lifecycle:

Kubeflow, MLRun, Weights & Biases, ZenML, Metaflow, ClearML, KServe, Airbyte, Prefect, MLJar

These tools will help me advance into scalable and enterprise-ready MLOps workflows using Kubernetes, event-driven pipelines, and real-time monitoring.

📘 Resources

Here are some of the learning materials and official documentation that helped me build strong foundations in MLOps and Cloud-based ML systems:

🎓 Courses

📚 Official Documentation


Happy Learning! 🚀

"The journey of a thousand models begins with a single commit."

Made with ❤️