A kubectl plugin for deploying and managing AI/ML models using the Kubernetes AI Toolchain Operator (Kaito).
kubectl-kaito simplifies AI model deployment on Kubernetes by providing an intuitive command-line interface that abstracts away complex YAML configurations. Deploy, manage, and interact with large language models and other AI workloads with simple commands.
- **One-command deployment**: Deploy AI models with a single command that automatically provisions GPU nodes and configures the inference stack
- **Real-time monitoring**: Monitor workspace deployment status with real-time conditions, NodeClaim tracking, and detailed health checks
- **OpenAI-compatible APIs**: Interact with deployed models through an OpenAI-compatible chat interface with customizable system prompts
- **Model discovery**: Browse and discover Kaito pre-configured AI models with detailed specifications and GPU requirements
- **Seamless endpoint access**: Access inference endpoints automatically using the Kubernetes API proxy; it works anywhere kubectl works, with no manual setup
# List available models
kubectl kaito models list
# Deploy a model for inference
kubectl kaito deploy --workspace-name my-workspace \
--model phi-3.5-mini-instruct \
--instance-type Standard_NC6s_v3
# Check deployment status
kubectl kaito status --workspace-name my-workspace
# Get inference endpoint
kubectl kaito get-endpoint --workspace-name my-workspace
# Start interactive chat
kubectl kaito chat --workspace-name my-workspace
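Because the served API is OpenAI-compatible, any standard HTTP client can also talk to the model. A minimal sketch using curl, assuming the URL printed by `get-endpoint` is reachable from your machine (the endpoint below is a placeholder, and the `/v1/chat/completions` path assumes the usual OpenAI-style route):

```shell
# Placeholder - substitute the URL printed by `kubectl kaito get-endpoint`
ENDPOINT="http://localhost:8080"

# Standard OpenAI-style chat completion request
PAYLOAD='{"model": "phi-3.5-mini-instruct", "messages": [{"role": "user", "content": "Hello!"}]}'

curl -s "$ENDPOINT/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" || echo "request failed; check that the endpoint is reachable"
```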
- Kubernetes cluster with GPU nodes
- Kaito operator installed in your cluster
- kubectl configured to access your cluster
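You can sanity-check these prerequisites before deploying. A quick sketch, assuming the Kaito operator registers its Workspace CRD under the `kaito.sh` API group (verify the group name against your installed Kaito version):

```shell
# Confirm kubectl can reach the cluster
kubectl cluster-info

# Confirm nodes are visible (GPU capacity depends on your provisioner)
kubectl get nodes -o wide

# Confirm the Kaito Workspace CRD is installed (assumes the kaito.sh API group)
kubectl get crd workspaces.kaito.sh
```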
Prerequisites: Install krew if you haven't already.
kubectl krew install kaito
# Get the script
curl -sO https://raw.githubusercontent.com/kaito-project/kaito-kubectl-plugin/refs/heads/main/hack/generate-krew-manifest.sh
# Generate the manifest for a specific version with real SHA256 values
export RELEASE_TAG=v0.1.1
chmod +x ./generate-krew-manifest.sh && ./generate-krew-manifest.sh $RELEASE_TAG
# Install the generated manifest
kubectl krew install --manifest=krew/kaito-$RELEASE_TAG.yaml
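To confirm krew picked up the plugin after installing from the manifest, list the installed plugins:

```shell
# `kaito` should appear among the krew-managed plugins
kubectl krew list
```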
kubectl kaito --help
# Deploy Phi-3.5 Mini for general inference
kubectl kaito deploy \
--workspace-name phi-workspace \
--model phi-3.5-mini-instruct \
--instance-type Standard_NC6s_v3
# Monitor deployment
kubectl kaito status --workspace-name phi-workspace --watch
# Test the deployment
kubectl kaito chat --workspace-name phi-workspace
# Fine-tune a model with your data
kubectl kaito deploy \
--workspace-name tune-phi \
--model phi-3.5-mini-instruct \
--tuning \
--tuning-method qlora \
--input-urls "https://example.com/training-data.parquet" \
--output-image "myregistry.azurecr.io/phi-tuned:v1" \
--output-image-secret my-registry-secret
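Tuning runs inside the workspace like any other deployment, so progress can be tracked with the same status command; once the workspace reports ready, the adapter image should be available in your registry:

```shell
# Watch the tuning workspace until the job completes and the adapter image is pushed
kubectl kaito status --workspace-name tune-phi --watch
```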
# Deploy the fine-tuned model
kubectl kaito deploy \
--workspace-name phi-tuned \
--model phi-3.5-mini-instruct \
--adapters phi-adapter="myregistry.azurecr.io/phi-tuned:v1"
# Deploy Llama-2 70B across multiple nodes
kubectl create secret generic hf-token --from-literal=token=your_token
kubectl kaito deploy \
--workspace-name large-llama \
--model llama-2-70b \
--model-access-secret hf-token \
--instance-type Standard_NC24ads_A100_v4 \
--count 4
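For multi-node deployments it can be useful to confirm that every replica was provisioned. A sketch, assuming your cloud provider sets the well-known `node.kubernetes.io/instance-type` label on nodes (with `--count 4` above, four matching GPU nodes are expected):

```shell
# Watch workspace conditions and NodeClaim provisioning
kubectl kaito status --workspace-name large-llama --watch

# Inspect the provisioned GPU nodes (label value is cloud-specific)
kubectl get nodes -l node.kubernetes.io/instance-type=Standard_NC24ads_A100_v4
```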
| Command | Description |
|---|---|
| `deploy` | Deploy a Kaito workspace for model inference or fine-tuning |
| `status` | Check the status of Kaito workspaces |
| `get-endpoint` | Get inference endpoints for a workspace |
| `chat` | Interactive chat with deployed AI models |
| `models` | Manage and list supported AI models |
# Clone the repository
git clone https://github.com/kaito-project/kaito-kubectl-plugin.git
cd kaito-kubectl-plugin
# Build the plugin
make build
# Uninstall the krew-managed plugin first so the local binary can be run without conflict
kubectl krew uninstall kaito
# Run the CLI from the local binary
./bin/kubectl-kaito --help
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.