Deployment of ML Model with Apache Airflow DAG Pipeline, Docker, Kubernetes & Streamlit UI

Welcome to this beginner-friendly tutorial where we dive into the world of MLOps by building a complete Machine Learning pipeline! In this video, we walk you through creating an Apache Airflow pipeline to load, train, and evaluate an ML model, save it, and make predictions. Then, we deploy the model with a sleek Streamlit UI, containerize it with Docker, and scale it with Kubernetes. Perfect for anyone starting their MLOps journey! πŸŽ“

YouTube Link: https://youtu.be/SrvhSMeMsio?si=WJcLwpapJsCMQSNJ

Requirements

  1. Apache Airflow
  2. Python
  3. Streamlit
  4. Docker
  5. DockerHub
  6. Kubernetes
  7. Kubectl

Set up Environment

sudo apt update && sudo apt upgrade -y
sudo apt install python3-pip python3-venv -y

Create Python Virtual Environment & Activate it

python3 -m venv venv
source venv/bin/activate
pip3 install apache-airflow flask_appbuilder apache-airflow-providers-fab streamlit scikit-learn pandas joblib 

Set up Airflow

Initialize Airflow (running any Airflow command, such as the version check below, creates the default AIRFLOW_HOME with its database and configuration file)

airflow version

Set & Verify Airflow Home

export AIRFLOW_HOME=~/airflow
echo $AIRFLOW_HOME

Confirm Existence of Airflow DB and Configuration File

ls -l ~/airflow

Create an Admin User for Airflow Web UI

  1. Replace "auth_manager = airflow.api_fastapi.auth.managers.simple.simple_auth_manager.SimpleAuthManager" in the airflow.cfg file with "auth_manager = airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager".
  2. Alternatively, comment out the original line and add the new one below it.

vim ~/airflow/airflow.cfg
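After the edit, the relevant section of airflow.cfg should look roughly like this (a sketch; in recent Airflow versions the key lives under [core]):

[core]
# auth_manager = airflow.api_fastapi.auth.managers.simple.simple_auth_manager.SimpleAuthManager
auth_manager = airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager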

Create Airflow DAGS Directory

mkdir -p ~/airflow/dags

Make sure the DAG is in the Airflow DAGS Directory

cp iris_model_pipeline_dag.py ~/airflow/dags/
ls ~/airflow/dags/
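For orientation, here is a minimal sketch of what a DAG like iris_model_pipeline_dag.py can look like. The task names and file paths below are assumptions for illustration, not the repo's exact code; the prediction step is sketched further down.

from datetime import datetime

import joblib
import pandas as pd
from airflow.decorators import dag, task
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

MODEL_PATH = "/tmp/iris_logistic_model.pkl"  # assumed artifact path, matching the steps below

@dag(dag_id="iris_model_pipeline", start_date=datetime(2024, 1, 1), schedule=None, catchup=False)
def iris_model_pipeline():

    @task
    def load_data() -> str:
        # Load the Iris dataset and stage it as a CSV for the downstream task
        data_path = "/tmp/iris_data.csv"
        load_iris(as_frame=True).frame.to_csv(data_path, index=False)
        return data_path

    @task
    def train_and_evaluate(data_path: str) -> str:
        # Train a logistic regression model, log its accuracy, and persist the artifact
        df = pd.read_csv(data_path)
        X, y = df.drop(columns=["target"]), df["target"]
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
        model = LogisticRegression(max_iter=200)
        model.fit(X_train, y_train)
        print(f"Test accuracy: {model.score(X_test, y_test):.3f}")
        joblib.dump(model, MODEL_PATH)
        return MODEL_PATH

    train_and_evaluate(load_data())

iris_model_pipeline()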

Start Airflow in Standalone Mode - Starts All Components (Scheduler, API Server, DB) and Creates an Admin User (For Dev Environments Only)

airflow standalone

In your browser:

localhost:8080

Get Admin User Password:

cat ~/airflow/simple_auth_manager_passwords.json.generated

Validate the DAG Pipeline (running the file directly confirms it parses without errors)

python ~/airflow/dags/iris_model_pipeline_dag.py

On the Airflow UI

  1. Look for the DAG Pipeline named: iris_model_pipeline
  2. Toggle the switch to "ON".
  3. Click the "Trigger DAG" button (the play icon) to start a run.
  4. Monitor the run (in the "Graph" or "Grid" view).

Once the DAG pipeline run succeeds, the model artifact (.pkl file) is saved under /tmp:

ls -ld /tmp/
ls /tmp/iris_logistic_model.pkl

Predictions Based on Sample Inputs in the Script

sample_data = [[5.1, 3.5, 1.4, 0.2], [6.7, 3.0, 5.2, 2.3]]  # Sample inputs
cat /tmp/iris_predictions.csv
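Under the hood, the prediction task presumably does something close to the sketch below; the CSV column names are assumptions.

import joblib
import pandas as pd

# Load the persisted model and score the two hard-coded samples
model = joblib.load("/tmp/iris_logistic_model.pkl")
sample_data = [[5.1, 3.5, 1.4, 0.2], [6.7, 3.0, 5.2, 2.3]]
predictions = model.predict(sample_data)
pd.DataFrame({"input": sample_data, "prediction": predictions}).to_csv(
    "/tmp/iris_predictions.csv", index=False
)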

Load the Model & Make Predictions on the Streamlit UI

streamlit run app.py
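If you are writing app.py yourself, a minimal version looks roughly like the sketch below; the widget labels, model path, and class-name mapping are assumptions.

import joblib
import streamlit as st

# Assumed path: adjust to wherever the .pkl artifact lives (/tmp locally, ./ inside the image)
model = joblib.load("iris_logistic_model.pkl")

st.title("Iris Species Predictor")
sepal_length = st.slider("Sepal length (cm)", 4.0, 8.0, 5.1)
sepal_width = st.slider("Sepal width (cm)", 2.0, 4.5, 3.5)
petal_length = st.slider("Petal length (cm)", 1.0, 7.0, 1.4)
petal_width = st.slider("Petal width (cm)", 0.1, 2.5, 0.2)

if st.button("Predict"):
    pred = model.predict([[sepal_length, sepal_width, petal_length, petal_width]])[0]
    st.success(f"Predicted species: {['setosa', 'versicolor', 'virginica'][int(pred)]}")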

Build a Docker Image Bundling the Trained Model & Its Dependencies into One Artifact

Copy the ML Model Artifact (.pkl file) to the Current Working Directory

cp /tmp/iris_logistic_model.pkl .
docker build -t ml-airflow-streamlit-app .
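The build expects a Dockerfile in the current directory. A plausible minimal one is sketched below; the base image and file names are assumptions.

FROM python:3.11-slim
WORKDIR /app
# Install only what the Streamlit app needs at inference time
RUN pip install --no-cache-dir streamlit scikit-learn pandas joblib
COPY app.py iris_logistic_model.pkl ./
EXPOSE 8501
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]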

Create a PAT and Log in to Your DockerHub Account

If the build fails, log in to DockerHub: create a Personal Access Token (PAT) in your DockerHub account and use it as your login password.

docker login -u iquantc

Try Docker Build Again

docker build -t ml-airflow-streamlit-app .
docker run -p 8501:8501 ml-airflow-streamlit-app

In your browser, use the container's network URL or localhost:

http://172.17.0.2:8501
http://localhost:8501

Tag Your Local Docker Image

docker tag ml-airflow-streamlit-app:latest iquantc/ml-airflow-streamlit-app:latest

Push the Image to DockerHub

docker push iquantc/ml-airflow-streamlit-app:latest

Deploy ML Model Docker Image to Kubernetes

Review manifest files

Create a Minikube Cluster

minikube start --driver=docker

Deploy the Kubernetes Manifest Files

Review the deployment manifest

kubectl apply -f deployment.yaml
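For reference, here is a minimal sketch of a deployment.yaml pairing a Deployment with a NodePort Service; the resource names, replica count, and nodePort value are assumptions.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-streamlit-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ml-streamlit-app
  template:
    metadata:
      labels:
        app: ml-streamlit-app
    spec:
      containers:
        - name: ml-streamlit-app
          image: iquantc/ml-airflow-streamlit-app:latest
          ports:
            - containerPort: 8501
---
apiVersion: v1
kind: Service
metadata:
  name: ml-streamlit-service
spec:
  type: NodePort
  selector:
    app: ml-streamlit-app
  ports:
    - port: 8501
      targetPort: 8501
      nodePort: 30080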

Check the Resources Created

kubectl get pods
kubectl get svc
minikube ip

Open in Browser

http://<minikube-ip>:<NodePort>
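Alternatively, minikube can print the service URL directly (the service name here matches the sketch above and is an assumption):

minikube service ml-streamlit-service --url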

Clean up

minikube stop
minikube delete --all
