# Text Classification using MLOps

This project demonstrates a complete MLOps pipeline for a text classification task, implementing end-to-end practices for model experimentation, tracking, packaging, and deployment. The project incorporates advanced features such as AWS CodeDeploy for automated blue-green deployment and Amazon Elastic Container Registry (ECR) for Docker image storage. To ensure scalability, reliability, and fault tolerance, it also utilizes **AWS Auto Scaling Groups (ASGs)**, **Load Balancers**, and **Launch Templates**.

---

## Project Overview

This repository includes:

- **Experiment Tracking**: Logs all training runs with parameters, metrics, and artifacts in MLflow.
- **Hyperparameter Tuning**: Uses MLflow to log and compare performance during hyperparameter optimization.
- **ML Pipeline with DVC**: Structures and manages machine learning pipelines, ensuring reproducibility.
- **Model Registration**: Registers the best-performing models for deployment using MLflow.
- **Data Versioning**: Tracks and versions datasets with DVC, storing them in Amazon S3.
- **Remote Experiment Tracking**: Hosts a centralized MLflow tracking server on DagsHub.
- **Automated CI/CD Pipelines**: Leverages GitHub Actions to automate testing, pipeline execution, and deployment.
- **Unit Testing**: Validates API endpoints, model loading, and configurations to ensure robust deployments.
- **AWS CodeDeploy with Blue-Green Deployment**: Deploys the application using AWS CodeDeploy to minimize downtime.
- **AWS ECR Integration**: Stores and retrieves Docker images for deployment.
- **Production Deployment**: Automates testing and model promotion to production, ensuring deployment readiness.
- **Scalability Features**:
  - **Auto Scaling Groups (ASGs)**: Automatically adjust the number of EC2 instances based on traffic and system load.
  - **Load Balancers**: Distribute traffic evenly across instances to ensure high availability and fault tolerance.
  - **Launch Templates**: Define instance configurations for easy scaling and reproducibility.

---

## Key Features

### 1. **Experiment Tracking with MLflow**
   - Logs all experiments, hyperparameters, metrics, and artifacts to an MLflow server hosted on DagsHub.
   - Simplifies comparison and selection of the best-performing models.
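
A minimal sketch of the logging pattern, assuming placeholder values for the DagsHub tracking URI, experiment name, and logged values:

```python
import mlflow

# Point MLflow at the DagsHub-hosted tracking server (placeholder URI).
mlflow.set_tracking_uri("https://dagshub.com/<user>/<repo>.mlflow")
mlflow.set_experiment("text-classification")

with mlflow.start_run():
    mlflow.log_param("max_features", 5000)               # example hyperparameter
    mlflow.log_metric("accuracy", 0.91)                  # example metric
    mlflow.log_artifact("reports/confusion_matrix.png")  # example artifact path
```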

### 2. **Hyperparameter Tuning**
   - Uses MLflow’s tracking capabilities for hyperparameter tuning.
   - Tracks each experiment run and selects the best configuration for deployment.
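
A sketch of how tuning runs might be grouped under a parent MLflow run; the grid values and the training helper are illustrative, not the project's actual search space:

```python
import mlflow
from itertools import product

def train_and_evaluate(lr, n_estimators):
    """Hypothetical stand-in for the project's real training routine."""
    return 0.5 + lr  # placeholder score

with mlflow.start_run(run_name="grid_search"):
    for lr, n_estimators in product([0.01, 0.1], [100, 300]):
        with mlflow.start_run(nested=True):  # one nested run per configuration
            mlflow.log_params({"learning_rate": lr, "n_estimators": n_estimators})
            mlflow.log_metric("f1_score", train_and_evaluate(lr, n_estimators))
```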

### 3. **Structured ML Pipeline with DVC**
   - Employs DVC to define and manage an end-to-end ML pipeline from data ingestion to model training.
   - Tracks all pipeline stages, ensuring reproducibility and efficient updates.
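
Each DVC stage typically wraps a script whose parameters live in `params.yaml` so DVC can detect changes. A hypothetical training-stage sketch (file paths and parameter names are assumptions):

```python
import pickle

import pandas as pd
import yaml
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# dvc.yaml would declare params.yaml, the input CSV, and the model file
# as params/deps/outs so that changes re-trigger this stage.
params = yaml.safe_load(open("params.yaml"))["train"]

df = pd.read_csv("data/processed/train.csv")
vectorizer = TfidfVectorizer(max_features=params["max_features"])
X = vectorizer.fit_transform(df["text"])
model = LogisticRegression(C=params["C"]).fit(X, df["label"])

with open("models/model.pkl", "wb") as f:
    pickle.dump(model, f)
```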

### 4. **Model Registration in MLflow**
   - Registers the best-performing models in the MLflow Model Registry.
   - Supports staging and production environments for model lifecycle management.
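
A sketch of registration and stage transition; the run ID and registry name are placeholders:

```python
import mlflow
from mlflow.tracking import MlflowClient

# Register the model artifact logged by a finished run (placeholder run ID).
result = mlflow.register_model("runs:/<run_id>/model", "text-classifier")

# Move the new version to Staging; promotion to Production happens
# automatically in CI once all tests pass.
MlflowClient().transition_model_version_stage(
    name="text-classifier", version=result.version, stage="Staging"
)
```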

### 5. **Data Versioning with DVC and S3**
   - Tracks data changes and stores datasets securely in an Amazon S3 bucket.
   - Allows easy rollback and version comparison for datasets.
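
With the data under DVC control, any revision can be read back programmatically. A minimal sketch, assuming a hypothetical dataset path and a `v1.0` Git tag:

```python
import dvc.api

# Stream a specific version of the dataset straight from the S3 remote.
with dvc.api.open("data/raw/dataset.csv", rev="v1.0") as f:
    header = f.readline()
```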

### 6. **Scalable Deployment with AWS**
   - **Auto Scaling Groups (ASGs)**:
     - Dynamically adjust the number of EC2 instances based on predefined scaling policies (e.g., CPU or memory usage).
     - Ensure cost efficiency by scaling in during low traffic and scaling out during peak traffic.
   - **Load Balancers**:
     - An Elastic Load Balancer (ELB) distributes incoming traffic evenly across all running instances.
     - Provides fault tolerance by automatically routing traffic away from unhealthy instances.
   - **Launch Templates**:
     - Hold predefined configurations for EC2 instances, including AMIs, instance types, security groups, and networking settings.
     - Simplify instance management and ensure consistency across scaling operations.
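
A minimal boto3 sketch of wiring these pieces together; every name, the AMI ID, the subnet, and the target group ARN are placeholders:

```python
import boto3

ec2 = boto3.client("ec2")
autoscaling = boto3.client("autoscaling")

# Launch template capturing the instance configuration.
ec2.create_launch_template(
    LaunchTemplateName="text-classification-lt",
    LaunchTemplateData={"ImageId": "ami-0123456789abcdef0", "InstanceType": "t3.micro"},
)

# ASG that keeps 1-4 instances registered with the load balancer's target group.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="text-classification-asg",
    LaunchTemplate={"LaunchTemplateName": "text-classification-lt", "Version": "$Latest"},
    MinSize=1,
    MaxSize=4,
    VPCZoneIdentifier="subnet-0123456789abcdef0",
    TargetGroupARNs=["<target-group-arn>"],
)
```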

### 7. **Automated Deployment with AWS CodeDeploy**
   - Implements blue-green deployment for seamless updates to Auto Scaling Groups (ASGs) behind a load balancer.
   - Ensures minimal downtime and safe transitions between application versions.
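
A deployment could be triggered from CI with a call like the sketch below; the application and deployment-group names are assumptions, and the blue-green behavior itself is configured on the deployment group:

```python
import boto3

codedeploy = boto3.client("codedeploy")

# Kick off a deployment to the blue-green deployment group
# (revision details omitted for brevity).
codedeploy.create_deployment(
    applicationName="text-classification",
    deploymentGroupName="text-classification-dg",
    description="Promote newly built image",
)
```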

### 8. **Integration with AWS ECR**
   - After passing all tests, Docker images are built and pushed to Amazon Elastic Container Registry (ECR).
   - CodeDeploy pulls these images for deployment to the ASGs.

### 9. **CI/CD with GitHub Actions**
   - Automates the workflow for testing, building, and deploying updates.
   - Triggers deployment only after all unit tests and validations pass.

### 10. **Unit Testing**
   - Comprehensive unit tests cover:
     - Flask API endpoints
     - Model loading
     - Model signature validation
   - Ensures reliability before deployment.
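
A sketch of what one such endpoint test might look like; the inline Flask app and the `/predict` route are assumptions standing in for the project's real application module:

```python
import unittest

from flask import Flask, jsonify, request

# Hypothetical stand-in for the project's real Flask app.
app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    text = request.get_json()["text"]
    return jsonify({"prediction": "positive" if text else "unknown"})

class TestPredictEndpoint(unittest.TestCase):
    def setUp(self):
        self.client = app.test_client()

    def test_predict_returns_ok(self):
        # Assumes a /predict route accepting JSON with a "text" field.
        response = self.client.post("/predict", json={"text": "great product"})
        self.assertEqual(response.status_code, 200)
        self.assertIn("prediction", response.get_json())

if __name__ == "__main__":
    unittest.main()
```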

---

## Deployment Workflow

1. **Build and Push to AWS ECR**:
   - After successful testing, the application is containerized using Docker.
   - The Docker image is pushed to AWS ECR for centralized storage.

2. **Automated Deployment**:
   - AWS CodeDeploy retrieves the Docker image from ECR.
   - It deploys the application to the ASGs using blue-green deployment to minimize downtime.
   - Load balancers ensure high availability, routing traffic only to healthy instances.

3. **Scaling and Traffic Management**:
   - ASGs adjust the number of instances based on traffic patterns.
   - Load balancers distribute incoming requests across available instances, ensuring optimal performance.

4. **Continuous Integration/Delivery**:
   - GitHub Actions automatically triggers the deployment pipeline on new commits.

---

## Setup

1. **Clone the Repository**:
   ```bash
   git clone https://github.com/2003HARSH/Text-Classification-using-MLOps.git
   cd Text-Classification-using-MLOps
   ```

2. **Install Dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

3. **Set Up AWS Services**:
   - **Auto Scaling Groups (ASGs)**: Define scaling policies for EC2 instances.
   - **Load Balancers**: Configure an ELB to distribute traffic across instances.
   - **Launch Templates**: Create templates for consistent instance configurations.

4. **Configure AWS CodeDeploy**:
   - Set up a CodeDeploy application with blue-green deployment using ASGs and a load balancer.

5. **Push Docker Image to ECR**:
   ```bash
   aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <account_id>.dkr.ecr.<region>.amazonaws.com
   docker build -t text-classification .
   docker tag text-classification:latest <account_id>.dkr.ecr.<region>.amazonaws.com/text-classification:latest
   docker push <account_id>.dkr.ecr.<region>.amazonaws.com/text-classification:latest
   ```

---

## Testing

- Run unit tests locally:
  ```bash
  python -m unittest <test_file_name>.py
  ```
- CI/CD workflows execute these tests automatically.

---

## Future Enhancements

1. **Enhanced Deployment**:
   - Deploy the application with AWS Elastic Container Service (ECS) for improved scaling and fault tolerance.
   - Integrate AWS CodePipeline to orchestrate the end-to-end deployment process.

2. **Model Monitoring**:
   - Integrate tools for monitoring model performance in production and detecting drift.

---

## Contact

Feel free to reach out at [harshnkgupta@gmail.com](mailto:harshnkgupta@gmail.com) or create an issue in the repository for questions or collaboration opportunities!