Cross-platform framework for deploying LLM/VLM/TTS models locally in your app.
- Available in Flutter, React Native, and Kotlin Multiplatform.
- Supports any GGUF model available on Hugging Face: Qwen, Gemma, Llama, DeepSeek, and more.
- Runs LLMs, VLMs, embedding models, TTS models, and more.
- Handles precisions from FP32 down to 2-bit quantized models, for efficiency and lower device strain.
- Chat templates with Jinja2 support and token streaming.
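As a rough illustration of why the low-bit quantization mentioned above matters on-device, weight memory scales with bits per parameter. The helper below is a back-of-envelope sketch and not part of the Cactus API; real GGUF files mix precisions and add metadata, and inference additionally needs KV-cache and activation memory.

```typescript
// Back-of-envelope estimate of model weight memory: params * bits / 8 bytes.
// Illustrative only; real GGUF files add metadata and mixed-precision layers.
function estimateModelMemoryMB(paramCount: number, bitsPerWeight: number): number {
  return (paramCount * bitsPerWeight) / 8 / (1024 * 1024);
}

// A 600M-parameter model at different precisions:
estimateModelMemoryMB(600_000_000, 32); // FP32: ~2289 MB
estimateModelMemoryMB(600_000_000, 8);  // Q8:   ~572 MB
estimateModelMemoryMB(600_000_000, 2);  // Q2:   ~143 MB
```

Dropping from FP32 to Q8 cuts weight memory by 4x with little quality loss; 2-bit quantization trades more quality for a further 4x reduction.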
- Install: run the following command in your project terminal:

  ```shell
  flutter pub add cactus
  ```
- Flutter Text Completion

  ```dart
  import 'package:cactus/cactus.dart';

  final lm = await CactusLM.init(
    modelUrl: 'https://huggingface.co/Cactus-Compute/Qwen3-600m-Instruct-GGUF/resolve/main/Qwen3-0.6B-Q8_0.gguf',
    contextSize: 2048,
  );

  final messages = [ChatMessage(role: 'user', content: 'Hello!')];
  final response = await lm.completion(messages, maxTokens: 100, temperature: 0.7);
  ```
- Flutter Embedding

  ```dart
  import 'package:cactus/cactus.dart';

  final lm = await CactusLM.init(
    modelUrl: 'https://huggingface.co/Cactus-Compute/Qwen3-600m-Instruct-GGUF/resolve/main/Qwen3-0.6B-Q8_0.gguf',
    contextSize: 2048,
    generateEmbeddings: true,
  );

  final text = 'Your text to embed';
  final result = await lm.embedding(text);
  ```
- Flutter VLM Completion

  ```dart
  import 'package:cactus/cactus.dart';

  final vlm = await CactusVLM.init(
    modelUrl: 'https://huggingface.co/Cactus-Compute/SmolVLM2-500m-Instruct-GGUF/resolve/main/SmolVLM2-500M-Video-Instruct-Q8_0.gguf',
    mmprojUrl: 'https://huggingface.co/Cactus-Compute/SmolVLM2-500m-Instruct-GGUF/resolve/main/mmproj-SmolVLM2-500M-Video-Instruct-Q8_0.gguf',
  );

  final messages = [ChatMessage(role: 'user', content: 'Describe this image')];
  final response = await vlm.completion(
    messages,
    imagePaths: ['/absolute/path/to/image.jpg'],
    maxTokens: 200,
    temperature: 0.3,
  );
  ```
- Flutter Cloud Fallback

  ```dart
  import 'package:cactus/cactus.dart';

  final lm = await CactusLM.init(
    modelUrl: 'https://huggingface.co/Cactus-Compute/Qwen3-600m-Instruct-GGUF/resolve/main/Qwen3-0.6B-Q8_0.gguf',
    contextSize: 2048,
    cactusToken: 'enterprise_token_here',
  );

  final messages = [ChatMessage(role: 'user', content: 'Hello!')];
  final response = await lm.completion(messages, maxTokens: 100, temperature: 0.7);

  // local (default): strictly run on-device only
  // localfirst: fall back to the cloud if the device fails
  // remotefirst: primarily remote; run locally if the API fails
  // remote: strictly run in the cloud
  final embedding = await lm.embedding('Your text', mode: 'localfirst');
  ```
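  The four routing modes described in the comments above boil down to an ordered list of backends to try. The sketch below illustrates that idea; `resolveOrder` and `runWithFallback` are hypothetical names for illustration, not part of the Cactus SDK.

  ```typescript
  // Hypothetical sketch of local/cloud routing; not the Cactus API.
  type Mode = 'local' | 'localfirst' | 'remotefirst' | 'remote';

  function resolveOrder(mode: Mode): Array<'local' | 'remote'> {
    switch (mode) {
      case 'local': return ['local'];                 // strictly on-device
      case 'localfirst': return ['local', 'remote'];  // cloud only if device fails
      case 'remotefirst': return ['remote', 'local']; // local only if API fails
      case 'remote': return ['remote'];               // strictly cloud
    }
  }

  async function runWithFallback<T>(
    mode: Mode,
    backends: Record<'local' | 'remote', () => Promise<T>>,
  ): Promise<T> {
    let lastError: unknown;
    for (const target of resolveOrder(mode)) {
      try {
        return await backends[target]();
      } catch (err) {
        lastError = err; // try the next backend, if any
      }
    }
    throw lastError;
  }
  ```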
N.B.: See the Flutter Docs for more.
- Install the `cactus-react-native` package:

  ```shell
  npm install cactus-react-native && npx pod-install
  ```
- React Native Text Completion

  ```typescript
  import { CactusLM } from 'cactus-react-native';

  const { lm, error } = await CactusLM.init({
    model: '/path/to/model.gguf', // local model file inside the app sandbox
    n_ctx: 2048,
  });

  const messages = [{ role: 'user', content: 'Hello!' }];
  const params = { n_predict: 100, temperature: 0.7 };
  const response = await lm.completion(messages, params);
  ```
- React Native Embedding

  ```typescript
  import { CactusLM } from 'cactus-react-native';

  const { lm, error } = await CactusLM.init({
    model: '/path/to/model.gguf', // local model file inside the app sandbox
    n_ctx: 2048,
    embedding: true,
  });

  const text = 'Your text to embed';
  const params = { normalize: true };
  const result = await lm.embedding(text, params);
  ```
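  Embedding vectors like the result above are usually compared with cosine similarity. A minimal, framework-agnostic helper, assuming the embedding comes back as a plain number array:

  ```typescript
  // Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|).
  function cosineSimilarity(a: number[], b: number[]): number {
    if (a.length !== b.length) throw new Error('dimension mismatch');
    let dot = 0, normA = 0, normB = 0;
    for (let i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
  }

  cosineSimilarity([1, 0], [1, 0]); // identical direction: 1
  cosineSimilarity([1, 0], [0, 1]); // orthogonal: 0
  ```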
- React Native VLM

  ```typescript
  import { CactusVLM } from 'cactus-react-native';

  const { vlm, error } = await CactusVLM.init({
    model: '/path/to/vision-model.gguf', // local model file inside the app sandbox
    mmproj: '/path/to/mmproj.gguf',      // local model file inside the app sandbox
  });

  const messages = [{ role: 'user', content: 'Describe this image' }];
  const params = {
    images: ['/absolute/path/to/image.jpg'],
    n_predict: 200,
    temperature: 0.3,
  };
  const response = await vlm.completion(messages, params);
  ```
- React Native Agents

  ```typescript
  import { CactusAgent } from 'cactus-react-native';

  // we recommend the Qwen 3 family; 0.6B is great
  const { agent, error } = await CactusAgent.init({
    model: '/path/to/model.gguf',
    n_ctx: 2048,
  });

  const weatherTool = agent.addTool(
    (location: string) => `Weather in ${location}: 72°F, sunny`,
    'Get current weather for a location',
    { location: { type: 'string', description: 'City name', required: true } }
  );

  const messages = [{ role: 'user', content: "What's the weather in NYC?" }];
  const result = await agent.completionWithTools(messages, {
    n_predict: 200,
    temperature: 0.7,
  });
  await agent.release();
  ```
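  To illustrate the tool pattern `addTool` uses above (a function, a description, and a parameter schema), here is a standalone sketch of a registry that dispatches a model-emitted tool call. `ToolRegistry` is hypothetical and not part of cactus-react-native:

  ```typescript
  // Hypothetical tool registry illustrating the (fn, description, schema) pattern.
  type ParamSpec = { type: string; description: string; required?: boolean };

  interface Tool {
    fn: (...args: string[]) => string;
    description: string;
    params: Record<string, ParamSpec>;
  }

  class ToolRegistry {
    private tools = new Map<string, Tool>();

    add(name: string, tool: Tool): void {
      this.tools.set(name, tool);
    }

    // Dispatch a model-emitted call like { name, arguments }.
    call(name: string, args: Record<string, string>): string {
      const tool = this.tools.get(name);
      if (!tool) throw new Error(`unknown tool: ${name}`);
      // Pass arguments in the order the schema declares them.
      const ordered = Object.keys(tool.params).map((k) => args[k]);
      return tool.fn(...ordered);
    }
  }

  const registry = new ToolRegistry();
  registry.add('get_weather', {
    fn: (location) => `Weather in ${location}: 72°F, sunny`,
    description: 'Get current weather for a location',
    params: { location: { type: 'string', description: 'City name', required: true } },
  });
  ```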
Get started with an example app built using CactusAgent.

See the React Docs for more.
- Add the Maven dependency to your KMP project's `build.gradle.kts`:

  ```kotlin
  kotlin {
      sourceSets {
          commonMain {
              dependencies {
                  implementation("com.cactus:library:0.2.4")
              }
          }
      }
  }
  ```
- Platform Setup:
  - Android: works automatically; the native libraries are included.
  - iOS: in Xcode, go to File → Add Package Dependencies, paste https://github.com/cactus-compute/cactus, and click Add.
- Kotlin Multiplatform Text Completion

  ```kotlin
  import com.cactus.CactusLM
  import kotlinx.coroutines.runBlocking

  runBlocking {
      val lm = CactusLM(
          threads = 4,
          contextSize = 2048,
          gpuLayers = 0 // set to 99 for full GPU offload
      )

      val downloadSuccess = lm.download(
          url = "path/to/huggingface/gguf",
          filename = "model_filename.gguf"
      )
      val initSuccess = lm.init("qwen3-600m.gguf")

      val result = lm.completion(
          prompt = "Hello!",
          maxTokens = 100,
          temperature = 0.7f
      )
  }
  ```
- Kotlin Multiplatform Speech to Text

  ```kotlin
  import com.cactus.CactusSTT
  import kotlinx.coroutines.runBlocking

  runBlocking {
      val stt = CactusSTT(
          language = "en-US",
          sampleRate = 16000,
          maxDuration = 30
      )

      // Only supports the default Vosk STT model for Android and the Apple Foundation model
      val downloadSuccess = stt.download()
      val initSuccess = stt.init()

      val result = stt.transcribe()
      result?.let { sttResult ->
          println("Transcribed: ${sttResult.text}")
          println("Confidence: ${sttResult.confidence}")
      }

      // Or transcribe from an audio file
      val fileResult = stt.transcribeFile("/path/to/audio.wav")
  }
  ```
- Kotlin Multiplatform VLM

  ```kotlin
  import com.cactus.CactusVLM
  import kotlinx.coroutines.runBlocking

  runBlocking {
      val vlm = CactusVLM(
          threads = 4,
          contextSize = 2048,
          gpuLayers = 0 // set to 99 for full GPU offload
      )

      val downloadSuccess = vlm.download(
          modelUrl = "path/to/huggingface/gguf",
          mmprojUrl = "path/to/huggingface/mmproj/gguf",
          modelFilename = "model_filename.gguf",
          mmprojFilename = "mmproj_filename.gguf"
      )
      val initSuccess = vlm.init("smolvlm2-500m.gguf", "mmproj-smolvlm2-500m.gguf")

      val result = vlm.completion(
          prompt = "Describe this image",
          imagePath = "/path/to/image.jpg",
          maxTokens = 200,
          temperature = 0.3f
      )
  }
  ```
N.B.: See the Kotlin Docs for more.
The Cactus backend is written in C/C++ and can run directly on phones, smart TVs, watches, speakers, cameras, laptops, and more. See the C++ Docs for more.
First, clone the repo with `git clone https://github.com/cactus-compute/cactus.git`, `cd` into it, and make all scripts executable with `chmod +x scripts/*.sh`.
- Flutter
  - Build the Android JNILibs with `scripts/build-flutter-android.sh`.
  - Build the Flutter plugin with `scripts/build-flutter.sh` (MUST be run before using the example).
  - Navigate to the example app with `cd flutter/example`.
  - Open your simulator via Xcode or Android Studio (see the walkthrough if you have not done this before).
  - Always start the app with `flutter clean && flutter pub get && flutter run`.
  - Play with the app, and make changes to the example app or plugin as desired.
- React Native
  - Build the Android JNILibs with `scripts/build-react-android.sh`.
  - Build the React Native package with `scripts/build-react.sh`.
  - Navigate to the example app with `cd react/example`.
  - Set up your simulator via Xcode or Android Studio (see the walkthrough if you have not done this before).
  - Always start the app with `yarn && yarn ios` or `yarn && yarn android`.
  - Play with the app, and make changes to the example app or package as desired.
  - For now, if you make changes in the package, manually copy the files/folders into `examples/react/node_modules/cactus-react-native`.
- Kotlin Multiplatform
  - Build the Android JNILibs with `scripts/build-flutter-android.sh` (Flutter and Kotlin share the same JNILibs).
  - Build the Kotlin library with `scripts/build-kotlin.sh` (MUST be run before using the example).
  - Navigate to the example app with `cd kotlin/example`.
  - Open your simulator via Xcode or Android Studio (see the walkthrough if you have not done this before).
  - Start the app with `./gradlew :composeApp:run` for desktop, or use Android Studio/Xcode for mobile.
  - Play with the app, and make changes to the example app or library as desired.
- C/C++
  - Navigate to the example app with `cd cactus/example`.
  - There are multiple main files: `main_vlm`, `main_llm`, `main_embed`, `main_tts`.
  - Build both the libraries and executables with `build.sh`.
  - Run one of the executables: `./cactus_vlm`, `./cactus_llm`, `./cactus_embed`, `./cactus_tts`.
  - Try different models and make changes as desired.
- Contributing
  - To contribute a bug fix, create a branch after making your changes with `git checkout -b <branch-name>` and submit a PR.
  - To contribute a feature, please raise an issue first so it can be discussed and to avoid overlapping with someone else's work.
  - Join our Discord.
| Device | Gemma3 1B Q4 (toks/sec) | Qwen3 4B Q4 (toks/sec) |
|---|---|---|
| iPhone 16 Pro Max | 54 | 18 |
| iPhone 16 Pro | 54 | 18 |
| iPhone 16 | 49 | 16 |
| iPhone 15 Pro Max | 45 | 15 |
| iPhone 15 Pro | 45 | 15 |
| iPhone 14 Pro Max | 44 | 14 |
| OnePlus 13 5G | 43 | 14 |
| Samsung Galaxy S24 Ultra | 42 | 14 |
| iPhone 15 | 42 | 14 |
| OnePlus Open | 38 | 13 |
| Samsung Galaxy S23 5G | 37 | 12 |
| Samsung Galaxy S24 | 36 | 12 |
| iPhone 13 Pro | 35 | 11 |
| OnePlus 12 | 35 | 11 |
| Galaxy S25 Ultra | 29 | 9 |
| OnePlus 11 | 26 | 8 |
| iPhone 13 mini | 25 | 8 |
| Redmi K70 Ultra | 24 | 8 |
| Xiaomi 13 | 24 | 8 |
| Samsung Galaxy S24+ | 22 | 7 |
| Samsung Galaxy Z Fold 4 | 22 | 7 |
| Xiaomi Poco F6 5G | 22 | 6 |
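The throughput figures above translate directly into response latency: the time to decode N tokens is roughly N divided by toks/sec, ignoring prompt prefill. A quick sketch:

```typescript
// Seconds to generate a reply of `tokens` tokens at a given decode rate.
// Ignores prompt prefill time, so real latency is somewhat higher.
function decodeSeconds(tokens: number, toksPerSec: number): number {
  return tokens / toksPerSec;
}

// A 100-token reply from Gemma3 1B Q4, using rates from the table:
decodeSeconds(100, 54); // iPhone 16 Pro Max: ~1.9 s
decodeSeconds(100, 22); // Xiaomi Poco F6 5G: ~4.5 s
```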
We provide a collection of recommended models on our Hugging Face page.