This repository demonstrates deploying ML models on Kubernetes using KServe and Kubeflow Lite components.
For a step-by-step walkthrough with detailed instructions and examples, see demo.md.
- Deploy and scale ML models with KServe
- Implement canary deployments and A/B testing
- Monitor model performance and health
- Secure model endpoints
- macOS with Homebrew
- Docker Desktop
- Python 3.9+
- kubectl
- Install dependencies:

  ```bash
  ./setup/install-dependencies.sh
  ```

- Create the cluster and install components:

  ```bash
  ./scripts/setup_cluster.sh
  ```

- Train the models:

  ```bash
  pip install -r requirements.txt
  python models/train_model_v1.py
  python models/train_model_v2.py
  ```

- Deploy the model:

  ```bash
  kubectl apply -f kubernetes/kserve_deployment.yaml
  export SERVICE_URL=$(kubectl get inferenceservice sentiment-classifier -o jsonpath='{.status.url}')
  ```

- Test the deployment:

  ```bash
  python scripts/test_model.py --url $SERVICE_URL
  ```
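If you want to call the deployed service directly rather than through `scripts/test_model.py`, the request shape below is a minimal sketch assuming the InferenceService speaks the KServe v1 inference protocol (`POST {SERVICE_URL}/v1/models/{name}:predict` with an `{"instances": [...]}` body). The example inputs are placeholders, and the response format depends on how the model's predictor is implemented:

```python
import json
import os
import urllib.request


def build_predict_request(service_url: str, model_name: str, texts: list[str]):
    """Build the URL and JSON body for a KServe v1 ':predict' call."""
    url = f"{service_url}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": texts}).encode("utf-8")
    return url, body


def predict(service_url: str, model_name: str, texts: list[str]):
    """POST the instances to the InferenceService and return the parsed JSON response."""
    url, body = build_predict_request(service_url, model_name, texts)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__" and "SERVICE_URL" in os.environ:
    # Placeholder inputs; adjust to whatever the trained model expects.
    print(predict(os.environ["SERVICE_URL"], "sentiment-classifier",
                  ["great product", "terrible service"]))
```

From your shell this is equivalent to a `curl -d '{"instances": [...]}' $SERVICE_URL/v1/models/sentiment-classifier:predict` call.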
- Model deployment and serving
- Automatic scaling based on traffic
- Canary deployments
- Authentication and security
- Metrics and monitoring
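As a rough illustration of the autoscaling feature, KServe (backed by Knative) scales predictor replicas on request concurrency by default, tunable on the InferenceService spec. The sketch below uses v1beta1 field names; the model format and `storageUri` are placeholders, not this repo's actual deployment values:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sentiment-classifier
spec:
  predictor:
    minReplicas: 1            # keep one pod warm; 0 would enable scale-to-zero
    maxReplicas: 5
    scaleMetric: concurrency  # scale on in-flight requests per replica
    scaleTarget: 10           # target 10 concurrent requests per replica
    model:
      modelFormat:
        name: sklearn                         # placeholder: match the actual model
      storageUri: "s3://models/sentiment/v1"  # placeholder URI
```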
See the docs/ folder for detailed guides on:
- Autoscaling configuration
- Canary deployment strategies
- Authentication setup
- Monitoring and observability
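For canary rollouts, KServe splits traffic between the last ready revision and the newest one via `canaryTrafficPercent`. A minimal sketch (v1beta1 field names; model details are placeholders rather than this repo's manifests):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sentiment-classifier
spec:
  predictor:
    canaryTrafficPercent: 10  # send 10% of traffic to the latest revision;
                              # raise it to promote, or remove it to send 100%
    model:
      modelFormat:
        name: sklearn                         # placeholder: match the actual model
      storageUri: "s3://models/sentiment/v2"  # placeholder URI for the new version
```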
```
├── kubernetes/   # K8s deployment files
├── models/       # ML model code
├── scripts/      # Utility scripts
└── docs/         # Detailed documentation
```
To tear down the demo cluster when you are done:

```bash
kind delete cluster --name kserve-demo
```
Check the troubleshooting guide or open an issue.