Ollama manifests to deploy in a Kubernetes cluster
-
Create a kind cluster. Make sure there is enough CPU and memory in the respective VM/Docker Desktop/Podman Desktop resource settings. This example uses a multi-node cluster, but a single-node cluster works as well. It will take some time for the deployment to become ready.
kind create cluster --config kind-config.yaml
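For reference, a minimal multi-node kind-config.yaml could look like the sketch below; the node count here is an assumption, and the actual file in this repo may differ.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  # one control-plane node and two workers; adjust to the resources available
  - role: control-plane
  - role: worker
  - role: worker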
-
Deploy Ollama and expose the service
kubectl create -f deploy/deployment.yaml
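The deploy/deployment.yaml manifest is expected to create the ollama namespace, a Deployment running the ollama/ollama image, and a Service named ollama. A rough sketch of such a manifest is shown below; the image tag, ports, and labels are assumptions, and the repo's actual manifest may differ.
apiVersion: v1
kind: Namespace
metadata:
  name: ollama
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
  namespace: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
        - name: ollama
          image: ollama/ollama
          ports:
            - containerPort: 11434   # default Ollama API port
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: ollama
spec:
  selector:
    app: ollama
  ports:
    - port: 8000          # service port used in the port-forward step below
      targetPort: 11434   # Ollama container port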
-
Port-forward the service to interact with Ollama running in the cluster.
Note: If there is a port conflict, use a different local port than 8000.
kubectl -n ollama port-forward svc/ollama 8000
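For example, to map local port 9000 (chosen arbitrarily here) to the service's port 8000 instead:
kubectl -n ollama port-forward svc/ollama 9000:8000
The curl commands below would then target http://localhost:9000 instead of http://localhost:8000.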
-
Download a model
curl http://localhost:8000/api/pull -d '{ "model": "llama3.2" }'
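The pull streams progress as JSON status messages. You can also check which models are present from inside the pod; this assumes the Deployment is named ollama in the ollama namespace:
kubectl -n ollama exec deploy/ollama -- ollama list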
-
List the downloaded models
curl http://localhost:8000/api/tags
-
Try chatting with the model
curl http://localhost:8000/api/generate -d '{ "model": "llama3.2", "prompt": "What is the capital of India?" }'
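By default /api/generate streams the answer as a series of JSON objects. To receive a single JSON response instead, you can set stream to false:
curl http://localhost:8000/api/generate -d '{ "model": "llama3.2", "prompt": "What is the capital of India?", "stream": false }'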
You have now installed and configured Ollama in a Kubernetes cluster.