kaito-kubectl-plugin

A kubectl plugin for deploying and managing AI/ML models using the Kubernetes AI Toolchain Operator (Kaito).

Overview

kubectl-kaito simplifies AI model deployment on Kubernetes by providing an intuitive command-line interface that abstracts away complex YAML configurations. Deploy, manage, and interact with large language models and other AI workloads with simple commands.

Features

  • One-command deployment: Deploy AI models with a single command that automatically provisions GPU nodes and configures the inference stack
  • Real-time monitoring: Monitor workspace deployment status with real-time conditions, NodeClaim tracking, and detailed health checks
  • OpenAI-compatible APIs: Interact with deployed models through an OpenAI-compatible chat interface with customizable system prompts
  • Model discovery: Browse Kaito's pre-configured AI models with detailed specifications and GPU requirements
  • Seamless endpoint access: Access inference endpoints automatically through the Kubernetes API proxy; this works anywhere kubectl works, with no manual setup
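To make the API proxy idea concrete, here is a minimal sketch of how a proxy path for a Service is formed. It is an illustration only and not necessarily how the plugin builds its URLs. The helper name `proxy_path` is hypothetical, and the Service name (same as the workspace), the port (80), and the `/v1/models` route in the commented example are assumptions.

```shell
# Build the Kubernetes API-server proxy path for a Service.
# Path shape: /api/v1/namespaces/<ns>/services/<scheme>:<name>:<port>/proxy
# Assumption: the inference Service is named after the workspace and
# listens on port 80 unless a third argument says otherwise.
proxy_path() {
  ns="$1"
  svc="$2"
  port="${3:-80}"
  printf '/api/v1/namespaces/%s/services/http:%s:%s/proxy' "$ns" "$svc" "$port"
}

# Example (needs a live cluster, so it is left commented out):
#   kubectl get --raw "$(proxy_path default my-workspace)/v1/models"
```

Because the request travels through the API server, it reuses the credentials in your kubeconfig and needs no port-forward or public IP.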

Quick Start

# List available models
kubectl kaito models list

# Deploy a model for inference
kubectl kaito deploy \
  --workspace-name my-workspace \
  --model phi-3.5-mini-instruct \
  --instance-type Standard_NC6s_v3

# Check deployment status
kubectl kaito status --workspace-name my-workspace

# Get inference endpoint
kubectl kaito get-endpoint --workspace-name my-workspace

# Start interactive chat
kubectl kaito chat --workspace-name my-workspace

Installation

Prerequisites

  • Kubernetes cluster with GPU nodes
  • Kaito operator installed in your cluster
  • kubectl configured to access your cluster
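The prerequisites above can be checked from a script before installing the plugin. The following is a sketch under stated assumptions: the function name `check_prereqs` is hypothetical, and the CRD name `workspaces.kaito.sh` is an assumption about how the Kaito operator registers its Workspace resource. It does not check for GPU nodes.

```shell
# Pre-flight check for the prerequisites listed above.
# Prints the first problem found and returns non-zero, or prints an OK message.
check_prereqs() {
  # kubectl must be available on PATH.
  if ! command -v kubectl >/dev/null 2>&1; then
    echo "kubectl not found on PATH"
    return 1
  fi
  # The cluster must be reachable with the current kubeconfig context.
  if ! kubectl cluster-info >/dev/null 2>&1; then
    echo "cannot reach the cluster"
    return 1
  fi
  # Assumption: the Kaito operator installs a CRD named workspaces.kaito.sh.
  if ! kubectl get crd workspaces.kaito.sh >/dev/null 2>&1; then
    echo "Kaito Workspace CRD not found"
    return 1
  fi
  echo "prerequisites look OK"
}
```

Call `check_prereqs` once at the top of a setup script and stop if it returns non-zero.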

Install via Krew

Prerequisites: Install krew if you haven't already.

From Krew Index

kubectl krew install kaito

Generate a Krew Manifest Locally from a Specific Release Tag

# Get the script
curl -sO https://raw.githubusercontent.com/kaito-project/kaito-kubectl-plugin/refs/heads/main/hack/generate-krew-manifest.sh

export RELEASE_TAG=v0.1.1
# Generate manifest for a specific version with real SHA256 values
chmod +x ./generate-krew-manifest.sh && ./generate-krew-manifest.sh $RELEASE_TAG

# Install the generated manifest
kubectl krew install --manifest=krew/kaito-$RELEASE_TAG.yaml
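When scripting the steps above, it can help to derive the manifest path from the tag in one place and reject obviously malformed tags early. This is a small sketch: the helper name `krew_manifest_path` is hypothetical, and the `v<digit>.<digit>.<digit>` tag shape is an assumption based on the `v0.1.1` example.

```shell
# Map a release tag to the manifest path used by the install command above.
# The pattern is a loose check: "v", then three dot-separated parts that
# each start with a digit (assumed tag convention, e.g. v0.1.1).
krew_manifest_path() {
  tag="$1"
  case "$tag" in
    v[0-9]*.[0-9]*.[0-9]*)
      printf 'krew/kaito-%s.yaml\n' "$tag"
      ;;
    *)
      echo "unexpected release tag: $tag" >&2
      return 1
      ;;
  esac
}

# Example (left commented out because it needs the generated file and krew):
#   manifest="$(krew_manifest_path "$RELEASE_TAG")" &&
#     [ -f "$manifest" ] &&
#     kubectl krew install --manifest="$manifest"
```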

Verify Installation

kubectl kaito --help

Usage Examples

Basic Model Deployment

# Deploy Phi-3.5 Mini for general inference
kubectl kaito deploy \
  --workspace-name phi-workspace \
  --model phi-3.5-mini-instruct \
  --instance-type Standard_NC6s_v3

# Monitor deployment
kubectl kaito status --workspace-name phi-workspace --watch

# Test the deployment
kubectl kaito chat --workspace-name phi-workspace
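Workspace names become Kubernetes object names, so a script that takes the name as user input may want to validate it before calling `deploy`. The sketch below applies the DNS label rule (lowercase letters, digits, and `-`, at most 63 characters, alphanumeric at both ends). Treating workspace names as DNS labels is a conservative assumption, and the function name `is_valid_workspace_name` is hypothetical; the plugin and operator may accept or reject names differently.

```shell
# Return 0 if the argument is a valid DNS label, non-zero otherwise.
is_valid_workspace_name() {
  name="$1"
  # Must be between 1 and 63 characters long.
  [ -n "$name" ] && [ "${#name}" -le 63 ] || return 1
  case "$name" in
    *[!a-z0-9-]*) return 1 ;;  # only lowercase letters, digits, and '-'
    -*|*-) return 1 ;;         # must start and end with a letter or digit
  esac
  return 0
}

# Example:
#   is_valid_workspace_name "phi-workspace" || echo "invalid workspace name" >&2
```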

Fine-tuning Workflow

# Fine-tune a model with your data
kubectl kaito deploy \
  --workspace-name tune-phi \
  --model phi-3.5-mini-instruct \
  --tuning \
  --tuning-method qlora \
  --input-urls "https://example.com/training-data.parquet" \
  --output-image "myregistry.azurecr.io/phi-tuned:v1" \
  --output-image-secret my-registry-secret

# Deploy the fine-tuned model
kubectl kaito deploy \
  --workspace-name phi-tuned \
  --model phi-3.5-mini-instruct \
  --adapters phi-adapter="myregistry.azurecr.io/phi-tuned:v1"

Multi-GPU Large Model Deployment

# Create a secret holding your model access token
kubectl create secret generic hf-token --from-literal=token=your_token

# Deploy Llama-2 70B across multiple nodes

kubectl kaito deploy \
  --workspace-name large-llama \
  --model llama-2-70b \
  --model-access-secret hf-token \
  --instance-type Standard_NC24ads_A100_v4 \
  --count 4

Commands

Command        Description
deploy         Deploy a Kaito workspace for model inference or fine-tuning
status         Check status of Kaito workspaces
get-endpoint   Get inference endpoints for a workspace
chat           Interactive chat with deployed AI models
models         Manage and list supported AI models
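The commands compose naturally in a script. As a sketch, the hypothetical wrapper `deploy_and_report` below chains `deploy`, `status`, and `get-endpoint` using only the flags shown in this README, and stops at the first command that fails. It assumes each command exits non-zero on error, which this README does not state.

```shell
# Deploy a workspace, then print its status and endpoint.
# Arguments: <workspace-name> <model> <instance-type>
deploy_and_report() {
  ws="$1"
  model="$2"
  instance="$3"
  # '&&' stops the chain at the first command that returns non-zero.
  kubectl kaito deploy \
      --workspace-name "$ws" \
      --model "$model" \
      --instance-type "$instance" &&
    kubectl kaito status --workspace-name "$ws" &&
    kubectl kaito get-endpoint --workspace-name "$ws"
}

# Example:
#   deploy_and_report my-workspace phi-3.5-mini-instruct Standard_NC6s_v3
```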

Documentation

📖 Complete Documentation

Development

Build from Source

# Clone the repository
git clone https://github.com/kaito-project/kaito-kubectl-plugin.git
cd kaito-kubectl-plugin

# Build the plugin
make build

# Uninstall the Krew-installed plugin first so you can run the local binary
kubectl krew uninstall kaito

# Run the CLI from the local binary
./bin/kubectl-kaito --help

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
