Integrated `langgraph4j-android-adapter` to enable `AgentExecutor` for structured AI workflows with tool support (e.g., cat language, weather, RAG search).

```kotlin
// Note: choose which model API you will use.
private val TEST_MODE = listOf("openai", "ollama", "local")[0]
```

Key changes:
- `SmolLMManager.kt`: Added `buildGraph` for `AgentExecutor` initialization and modified response generation to use `graph.streamSnapshots`.
```kotlin
fun resetGraph(_instance: SmolLM) {
    instanceWithTools = SmolLMInferenceEngine(_instance, ToolSpecifications.toolSpecificationsFrom(DummyTools()))
    val stateGraph = AgentExecutor.builder()
        .chatLanguageModel(instanceWithTools)
        .toolSpecification(DummyTools())
        .build()
    graph = stateGraph.compile(CompileConfig.builder().checkpointSaver(MemorySaver()).build())
}
```
- `SmolLMInferenceEngine.kt`: Created to wrap `SmolLM` as a `LocalLLMInferenceEngine`, supporting chat and tool integration.
```kotlin
class SmolLMInferenceEngine(
    private val smolLM: SmolLM,
    toolSpecifications: List<ToolSpecification> = emptyList(),
    toolAdapter: LLMToolAdapter = Llama3_2_ToolAdapter()
) : LocalLLMInferenceEngine(toolSpecifications, toolAdapter) {
    override fun generate(prompt: String): Flow<String> = smolLM.getResponse(prompt)
}
```
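`generate` returns the model's response as a stream of partial tokens. As a rough, self-contained illustration of that streaming shape (`streamTokens` below is a made-up stand-in, not SmolLM's actual API; a stdlib `Sequence` is used in place of `Flow`):

```kotlin
// Hypothetical stand-in for token streaming: yields the response piece by
// piece, the way generate() emits partial tokens through a Flow.
fun streamTokens(response: String): Sequence<String> = sequence {
    for (word in response.split(" ")) yield(word)
}

fun main() {
    // Consumers typically append each emitted chunk to the UI as it arrives.
    val chunks = streamTokens("hello from the model").toList()
    println(chunks.joinToString(" "))
}
```
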
- `DummyTools.kt`: Defined tools for `AgentExecutor` (e.g., `cat_language`, `weather`, `alice_search`).
```kotlin
class DummyTools {
    @Tool("translate a string into cat language, returns string")
    fun cat_language(@P("Original string") text: String): String = text.toList().joinToString(" Miao ")
}
```
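For reference, the `cat_language` tool body simply interleaves `" Miao "` between the characters of the input. The same expression, stripped of the langchain4j annotations so it can run standalone:

```kotlin
// Same transformation as the cat_language tool body, without annotations.
fun catLanguage(text: String): String = text.toList().joinToString(" Miao ")

fun main() {
    println(catLanguage("hi")) // h Miao i
}
```
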
- `app/build.gradle.kts`: Added dependencies on `langgraph4j-android-adapter` (Langgraph4j support) and `rag_android` (RAG functionality), and updated packaging to exclude conflicting metadata.
```kotlin
plugins {
    // ...
    id("io.objectbox")
}

android {
    packaging {
        resources {
            excludes += listOf(
                // for langgraph4j
                "META-INF/INDEX.LIST",
                "META-INF/io.netty.versions.properties",
                // for rag
                "META-INF/DEPENDENCIES",
                "META-INF/DEPENDENCIES.txt",
                "META-INF/LICENSE",
                "META-INF/LICENSE.txt",
                "META-INF/NOTICE",
                "META-INF/NOTICE.txt",
                "/META-INF/{AL2.0,LGPL2.1}"
            )
        }
    }
}

dependencies {
    implementation("org.bsc.langgraph4j:langgraph4j-core:1.5.8")
    implementation("org.bsc.langgraph4j:langgraph4j-langchain4j:1.5.8")
    implementation("org.bsc.langgraph4j:langgraph4j-agent-executor:1.5.8")
    implementation("dev.langchain4j:langchain4j:1.0.0-beta3")
    implementation("dev.langchain4j:langchain4j-open-ai:1.0.0-beta3")
    implementation("dev.langchain4j:langchain4j-ollama:1.0.0-beta3")
    implementation(project(":langgraph4j-android-adapter"))
    implementation(project(":rag_android"))
}
```
HTTP Communication Issues:

Cleartext Traffic Blocked: Android blocks HTTP (cleartext) traffic by default for apps targeting API 28+. If you encounter `java.net.UnknownServiceException: CLEARTEXT communication to <IP> not permitted`, configure (or create) the network security policy in `app/src/main/res/xml/network_security_config.xml`:
```xml
<?xml version="1.0" encoding="utf-8"?>
<network-security-config>
    <domain-config cleartextTrafficPermitted="true">
        <domain includeSubdomains="true">ollama.local</domain>
        <!-- Add other specific IPs as needed -->
    </domain-config>
</network-security-config>
```
Then modify `AndroidManifest.xml`:

```xml
...
<uses-permission android:name="android.permission.INTERNET"/>
<application
    ...
    android:networkSecurityConfig="@xml/network_security_config">
    ...
</application>
</manifest>
```
- Download the latest APK from GitHub Releases and transfer it to your Android device.
- If your device does not allow installing APKs from untrusted sources, search for how to enable installation from unknown sources for your device.

Obtainium allows users to update/download apps directly from their sources, like GitHub or F-Droid.
- Download the Obtainium app by choosing your device architecture or 'Download Universal APK'.
- From the bottom menu, select '➕Add App'
- In the text field labelled 'App source URL *', enter the following URL and click 'Add' beside the text field:
https://github.com/shubham0204/SmolChat-Android
- SmolChat should now be visible in the 'Apps' screen. You can get notifications about newer releases and download them directly without going to the GitHub repo.
- Provide a usable user interface to interact with SLMs (small language models) locally, on-device
- Allow users to add/remove SLMs (GGUF models) and modify their system prompts or inference parameters (temperature, min-p)
- Allow users to create specific-downstream tasks quickly and use SLMs to generate responses
- Simple, easy to understand, extensible codebase
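Min-p, one of the inference parameters mentioned above, keeps only tokens whose probability is at least `min_p` times the probability of the most likely token, then renormalizes before sampling. The app's actual sampler lives in llama.cpp; the sketch below (with a made-up `minPFilter` helper) only illustrates the idea:

```kotlin
// Illustrative min-p filtering: drop tokens far below the top token's
// probability, then renormalize the survivors to sum to 1.
fun minPFilter(probs: Map<String, Double>, minP: Double): Map<String, Double> {
    val maxProb = probs.values.maxOrNull()!!
    val kept = probs.filterValues { it >= minP * maxProb }
    val total = kept.values.sum()
    return kept.mapValues { it.value / total }
}

fun main() {
    val probs = mapOf("a" to 0.6, "b" to 0.3, "c" to 0.1)
    // With minP = 0.5 the threshold is 0.3, so "a" and "b" survive.
    println(minPFilter(probs, 0.5))
}
```

A lower `min_p` admits more low-probability tokens (more diverse output); a higher value makes sampling more conservative.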
- Clone the repository with its submodule originating from llama.cpp:

```shell
git clone --depth=1 https://github.com/shubham0204/SmolChat-Android
cd SmolChat-Android
git submodule update --init --recursive
```
- Android Studio starts building the project automatically. If not, select Build > Rebuild Project to start a project build.
- After a successful project build, connect an Android device to your system. Once connected, the device's name should appear in the top menu bar in Android Studio.
- The application uses llama.cpp to load and execute GGUF models. As llama.cpp is written in pure C/C++, it is easy to compile on Android-based targets using the NDK.
- The `smollm` module uses an `llm_inference.cpp` class which interacts with llama.cpp's C-style API to execute the GGUF model, and a JNI binding `smollm.cpp`. Check the C++ source files here. On the Kotlin side, the `SmolLM` class provides the required methods to interact with the JNI (C++ side) bindings.
- The `app` module contains the application logic and UI code. Whenever a new chat is opened, the app instantiates the `SmolLM` class and provides it the model file path, which is stored by the `LLMModel` entity. Next, the app adds messages with roles `user` and `system` to the chat by retrieving them from the database and using `LLMInference::addChatMessage`.
- For tasks, the messages are not persisted; we inform `LLMInference` of this by passing `_storeChats=false` to `LLMInference::loadModel`.
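Conceptually, the chat-vs-task distinction above boils down to whether messages are written back to storage. A minimal sketch of that behavior (class and method names here are illustrative stand-ins, not the app's actual API):

```kotlin
data class ChatMessage(val role: String, val content: String)

// Illustrative only: models the difference between the in-memory context
// (always used for generation) and persisted chats gated by a storeChats flag.
class InferenceSketch(private val storeChats: Boolean) {
    val context = mutableListOf<ChatMessage>()   // fed to the model
    val database = mutableListOf<ChatMessage>()  // persisted only for chats
    fun addChatMessage(role: String, content: String) {
        val msg = ChatMessage(role, content)
        context.add(msg)
        if (storeChats) database.add(msg)
    }
}

fun main() {
    val chat = InferenceSketch(storeChats = true)
    chat.addChatMessage("user", "Hello!")
    val task = InferenceSketch(storeChats = false)
    task.addChatMessage("user", "Summarize this.")
    println("chat persisted=${chat.database.size}, task persisted=${task.database.size}")
}
```
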
- ggerganov/llama.cpp is a pure C/C++ framework to execute machine learning models on multiple execution backends. It provides a primitive C-style API to interact with LLMs converted to the GGUF format native to ggml/llama.cpp. The app uses JNI bindings to interact with a small class `smollm.cpp` which uses llama.cpp to load and execute GGUF models.
- noties/Markwon is a markdown rendering library for Android. The app uses Markwon and Prism4j (for code syntax highlighting) to render Markdown responses from the SLMs.
- shubham0204/Android-Doc-QA: On-device RAG-based question answering from documents
- shubham0204/OnDevice-Face-Recognition-Android: Realtime face recognition with FaceNet, Mediapipe and ObjectBox's vector database
- shubham0204/FaceRecognition_With_FaceNet_Android: Realtime face recognition with FaceNet, MLKit
- shubham0204/CLIP-Android: On-device CLIP inference in Android (search images with textual queries)
- shubham0204/Segment-Anything-Android: Execute Meta's SAM model in Android with onnxruntime
- shubham0204/Depth-Anything-Android: Execute the Depth-Anything model in Android with onnxruntime for monocular depth estimation
- shubham0204/Sentence-Embeddings-Android: Generate sentence embeddings (from models like `all-MiniLM-L6-V2`) in Android
The following features/tasks are planned for future releases of the app:
- Assign names to chats automatically (just like ChatGPT and Claude)
- Add a search bar to the navigation drawer to search for messages within chats
- Add a background service which uses Bluetooth/HTTP/WiFi to communicate with a desktop application to send queries from the desktop to the mobile device for inference
- Enable auto-scroll when generating a partial response in `ChatActivity`
- Measure RAM consumption
- Integrate Android-Doc-QA for on-device RAG-based question answering from documents
- Check if llama.cpp can be compiled to use Vulkan for inference on Android devices (and use the mobile GPU)