Commit cc6f933: Update README.md (1 parent dc12610)

README.md (108 additions, 125 deletions)

# Text Classification using MLOps

This project demonstrates a complete MLOps pipeline for a text classification task, implementing end-to-end practices for model experimentation, tracking, packaging, and deployment. The project incorporates advanced features such as AWS CodeDeploy for automated blue-green deployment and AWS Elastic Container Registry (ECR) for model storage. To ensure scalability, reliability, and fault tolerance, it also uses **AWS Auto Scaling Groups (ASGs)**, **Load Balancers**, and **Launch Templates**.

---

## Project Overview

This repository includes:

- **Experiment Tracking**: Logs all training runs with parameters, metrics, and artifacts in MLflow.
- **Hyperparameter Tuning**: Uses MLflow to log and compare performance during hyperparameter optimization.
- **ML Pipeline with DVC**: Structures and manages machine learning pipelines, ensuring reproducibility.
- **Model Registration**: Registers the best-performing models for deployment using MLflow.
- **Data Versioning**: Tracks and versions datasets with DVC, storing them in Amazon S3.
- **Remote Experiment Tracking**: Hosts a centralized MLflow tracking server on DagsHub.
- **Automated CI/CD Pipelines**: Leverages GitHub Actions to automate testing, pipeline execution, and deployment.
- **Unit Testing**: Validates API endpoints, model loading, and configurations to ensure robust deployments.
- **AWS CodeDeploy with Blue-Green Deployment**: Deploys the application with AWS CodeDeploy to minimize downtime.
- **AWS ECR Integration**: Stores and retrieves Docker images for deployment.
- **Production Deployment**: Automates testing and model promotion to production, ensuring deployment readiness.
- **Scalability Features**:
  - **Auto Scaling Groups (ASGs)**: Automatically adjust the number of EC2 instances based on traffic and system load.
  - **Load Balancers**: Distribute traffic evenly across instances for high availability and fault tolerance.
  - **Launch Templates**: Define instance configurations for easy scaling and reproducibility.

---

## Key Features

### 1. **Experiment Tracking with MLflow**
- Logs all experiments, hyperparameters, metrics, and artifacts to an MLflow server hosted on DagsHub.
- Simplifies comparison and selection of the best-performing models.

### 2. **Hyperparameter Tuning**
- Uses MLflow’s tracking capabilities for hyperparameter tuning.
- Tracks each experiment run and selects the best configuration for deployment.

### 3. **Structured ML Pipeline with DVC**
- Employs DVC to define and manage an end-to-end ML pipeline from data ingestion to model training.
- Tracks all pipeline stages, ensuring reproducibility and efficient updates.

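A `dvc.yaml` for such a pipeline might look like this (stage names and script paths are illustrative, not the repository's actual definitions):

```yaml
stages:
  preprocess:
    cmd: python src/data/make_dataset.py
    deps:
      - src/data/make_dataset.py
      - data/raw
    outs:
      - data/processed
  train:
    cmd: python src/models/train_model.py
    deps:
      - src/models/train_model.py
      - data/processed
    outs:
      - models/model.pkl
```

`dvc repro` then re-executes only the stages whose dependencies changed.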
### 4. **Model Registration in MLflow**
- Registers the best-performing models to the MLflow Model Registry.
- Supports staging and production environments for model lifecycle management.

### 5. **Data Versioning with DVC and S3**
- Tracks data changes and stores datasets securely in an Amazon S3 bucket.
- Allows easy rollback and version comparison for datasets.

### 6. **Scalable Deployment with AWS**
- **Auto Scaling Groups (ASGs)**:
  - Dynamically adjust the number of EC2 instances based on predefined scaling policies (e.g., CPU or memory usage).
  - Ensure cost efficiency by scaling down during low traffic and scaling up during peaks.
- **Load Balancers**:
  - An Elastic Load Balancer (ELB) distributes incoming traffic evenly across all running instances.
  - Provides fault tolerance by automatically routing traffic away from unhealthy instances.
- **Launch Templates**:
  - Predefine configurations for EC2 instances, including AMIs, instance types, security groups, and networking settings.
  - Simplify instance management and ensure consistency across scaling operations.

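To make the moving parts concrete, here is a hedged sketch of the parameters such an ASG ties together, expressed as the request one might pass to boto3's Auto Scaling client (all names and the ARN are hypothetical, not taken from this deployment):

```python
# Hypothetical Auto Scaling Group configuration. In practice this dict
# would be passed as keyword arguments to
#   boto3.client("autoscaling").create_auto_scaling_group(**asg_config)
asg_config = {
    "AutoScalingGroupName": "text-clf-asg",        # hypothetical name
    "MinSize": 1,
    "MaxSize": 4,
    "DesiredCapacity": 2,
    "LaunchTemplate": {                            # reusable instance config
        "LaunchTemplateName": "text-clf-template",
        "Version": "$Latest",
    },
    # Target group of the load balancer that fronts the instances.
    "TargetGroupARNs": ["arn:aws:elasticloadbalancing:<region>:<account_id>:targetgroup/text-clf/abc123"],
    "HealthCheckType": "ELB",                      # route around unhealthy instances
}

assert asg_config["MinSize"] <= asg_config["DesiredCapacity"] <= asg_config["MaxSize"]
```

Setting `HealthCheckType` to `ELB` is what lets the load balancer's health checks drive instance replacement, rather than EC2 status checks alone.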
### 7. **Automated Deployment with AWS CodeDeploy**
- Implements blue-green deployment for seamless updates to Auto Scaling Groups (ASGs) behind a load balancer.
- Ensures minimal downtime and safe transitions between application versions.

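For EC2-based deployments, CodeDeploy drives each revision through lifecycle hooks described in an `appspec.yml`; a minimal sketch (destination path and script names are hypothetical) might be:

```yaml
version: 0.0
os: linux
files:
  - source: /
    destination: /home/ubuntu/app
hooks:
  ApplicationStop:
    - location: scripts/stop_app.sh
      timeout: 60
  ApplicationStart:
    - location: scripts/start_app.sh
      timeout: 300
      runas: ubuntu
```

In a blue-green deployment, CodeDeploy runs these hooks on the replacement (green) instances, then shifts load balancer traffic over once they pass health checks.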
### 8. **Integration with AWS ECR**
- After all tests pass, Docker images are built and pushed to AWS Elastic Container Registry (ECR).
- CodeDeploy pulls these images for deployment to ASGs.

### 9. **CI/CD with GitHub Actions**
- Automates the workflow for testing, building, and deploying updates.
- Triggers deployment only after all unit tests and validations pass.

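A trimmed-down workflow along these lines could live in `.github/workflows/` (file, job, and step names are illustrative, not the repository's actual workflow):

```yaml
name: ci
on: [push]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - run: pip install -r requirements.txt
      - run: python -m unittest discover tests
      # On success, later jobs would build the Docker image,
      # push it to ECR, and create a CodeDeploy deployment.
```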
### 10. **Unit Testing**
- Comprehensive unit tests for:
  - Flask API endpoints
  - Model loading
  - Model signature validation
- Ensures reliability before deployment.

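The pattern can be sketched with the standard `unittest` module (`validate_signature` here is a hypothetical stand-in for the project's real checks):

```python
import unittest

# Hypothetical expected input schema for the model.
EXPECTED_COLUMNS = ["text"]

def validate_signature(columns):
    """Return True when the input columns match the expected schema."""
    return list(columns) == EXPECTED_COLUMNS

class TestModelSignature(unittest.TestCase):
    def test_accepts_expected_schema(self):
        self.assertTrue(validate_signature(["text"]))

    def test_rejects_unexpected_schema(self):
        self.assertFalse(validate_signature(["text", "label"]))
```

Such a file runs locally with `python -m unittest` and in CI on every push.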

---

## Deployment Workflow

1. **Build and Push to AWS ECR**:
   - After successful testing, the application is containerized with Docker.
   - The Docker image is pushed to AWS ECR for centralized storage.

2. **Automated Deployment**:
   - AWS CodeDeploy retrieves the Docker image from ECR.
   - The application is deployed to ASGs using blue-green deployment to minimize downtime.
   - Load balancers ensure high availability, routing traffic only to healthy instances.

3. **Scaling and Traffic Management**:
   - ASGs adjust the number of instances based on traffic patterns.
   - Load balancers distribute incoming requests across available instances, ensuring optimal performance.

4. **Continuous Integration/Delivery**:
   - GitHub Actions automatically triggers the deployment pipeline on new commits.

---

## Setup

1. **Clone the Repository**:
```bash
git clone https://github.com/2003HARSH/Text-Classification-using-MLOps.git
cd Text-Classification-using-MLOps
```

2. **Install Dependencies**:
```bash
pip install -r requirements.txt
```

3. **Set Up AWS Services**:
   - **Auto Scaling Groups (ASGs)**: Define scaling policies for EC2 instances.
   - **Load Balancers**: Configure an ELB to distribute traffic across instances.
   - **Launch Templates**: Create templates for consistent instance configurations.

4. **Configure AWS CodeDeploy**:
   - Set up a CodeDeploy application with blue-green deployment using ASGs and a load balancer.

5. **Push Docker Image to ECR**:
```bash
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <account_id>.dkr.ecr.<region>.amazonaws.com
docker build -t text-classification .
docker tag text-classification:latest <account_id>.dkr.ecr.<region>.amazonaws.com/text-classification:latest
docker push <account_id>.dkr.ecr.<region>.amazonaws.com/text-classification:latest
```

---

---
## Testing

- Run unit tests locally:
```bash
python -m unittest <test_file_name>.py
```
- CI/CD workflows execute these tests automatically.

---

## Future Enhancements

1. **Enhanced Deployment**:
   - Deploy the application with AWS Elastic Container Service (ECS) for scaling and fault tolerance.
   - Integrate AWS CodePipeline to orchestrate the end-to-end deployment process.

2. **Model Monitoring**:
   - Integrate tools for monitoring model performance in production and detecting drift.

---

## Contact

Feel free to reach out at [harshnkgupta@gmail.com](mailto:harshnkgupta@gmail.com) or open an issue in the repository for questions or collaboration opportunities!