Integrated `langgraph4j-android-adapter` to enable `AgentExecutor` for structured AI workflows with tool support (e.g., cat language, weather, RAG search).

```kotlin
// Note: choose which model API you will use.
private val TEST_MODE = listOf("openai", "ollama", "local")[0]
```

Key changes:
- `SmolLMManager.kt`: Added `buildGraph` for `AgentExecutor` initialization and modified response generation to use `graph.streamSnapshots`.
```kotlin
fun resetGraph(_instance: SmolLM) {
    instanceWithTools = SmolLMInferenceEngine(_instance, ToolSpecifications.toolSpecificationsFrom(DummyTools()))
    val stateGraph = AgentExecutor.builder()
        .chatLanguageModel(instanceWithTools)
        .toolSpecification(DummyTools())
        .build()
    graph = stateGraph.compile(CompileConfig.builder().checkpointSaver(MemorySaver()).build())
}
```
- `SmolLMInferenceEngine.kt`: Created to wrap `SmolLM` as a `LocalLLMInferenceEngine`, supporting chat and tool integration.
```kotlin
class SmolLMInferenceEngine(
    private val smolLM: SmolLM,
    toolSpecifications: List<ToolSpecification> = emptyList(),
    toolAdapter: LLMToolAdapter = Llama3_2_ToolAdapter()
) : LocalLLMInferenceEngine(toolSpecifications, toolAdapter) {
    override fun generate(prompt: String): Flow<String> = smolLM.getResponse(prompt)
}
```
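`generate` returns the model's response as a stream of partial tokens. As a rough, self-contained illustration of that streaming shape (`streamTokens` below is a made-up stand-in, not SmolLM's actual API; a stdlib `Sequence` is used in place of `Flow`):

```kotlin
// Hypothetical stand-in for token streaming: yields the response piece by
// piece, the way generate() emits partial tokens through a Flow.
fun streamTokens(response: String): Sequence<String> = sequence {
    for (word in response.split(" ")) yield(word)
}

fun main() {
    // Consumers typically append each emitted chunk to the UI as it arrives.
    val chunks = streamTokens("hello from the model").toList()
    println(chunks.joinToString(" "))
}
```
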
- `DummyTools.kt`: Defined tools for `AgentExecutor` (e.g., `cat_language`, `weather`, `alice_search`).
```kotlin
class DummyTools {
    @Tool("translate a string into cat language, returns string")
    fun cat_language(@P("Original string") text: String): String = text.toList().joinToString(" Miao ")
}
```
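For reference, the `cat_language` tool body simply interleaves `" Miao "` between the characters of the input. The same expression, stripped of the langchain4j annotations so it can run standalone:

```kotlin
// Same transformation as the cat_language tool body, without annotations.
fun catLanguage(text: String): String = text.toList().joinToString(" Miao ")

fun main() {
    println(catLanguage("hi")) // h Miao i
}
```
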
- `app/build.gradle.kts`: Added dependencies on `langgraph4j-android-adapter` (Langgraph4j support) and `rag_android` (RAG functionality), and updated packaging to exclude conflicting metadata.
```kotlin
plugins {
    // ...
    id("io.objectbox")
}

android {
    packaging {
        resources {
            excludes += listOf(
                // for langgraph4j
                "META-INF/INDEX.LIST",
                "META-INF/io.netty.versions.properties",
                // for rag
                "META-INF/DEPENDENCIES",
                "META-INF/DEPENDENCIES.txt",
                "META-INF/LICENSE",
                "META-INF/LICENSE.txt",
                "META-INF/NOTICE",
                "META-INF/NOTICE.txt",
                "/META-INF/{AL2.0,LGPL2.1}"
            )
        }
    }
}

dependencies {
    implementation("org.bsc.langgraph4j:langgraph4j-core:1.5.8")
    implementation("org.bsc.langgraph4j:langgraph4j-langchain4j:1.5.8")
    implementation("org.bsc.langgraph4j:langgraph4j-agent-executor:1.5.8")
    implementation("dev.langchain4j:langchain4j:1.0.0-beta3")
    implementation("dev.langchain4j:langchain4j-open-ai:1.0.0-beta3")
    implementation("dev.langchain4j:langchain4j-ollama:1.0.0-beta3")
    implementation(project(":langgraph4j-android-adapter"))
    implementation(project(":rag_android"))
}
```
HTTP Communication Issues:

Cleartext Traffic Blocked: Android blocks HTTP (cleartext) traffic by default for apps targeting API 28+. If you encounter `java.net.UnknownServiceException: CLEARTEXT communication to <IP> not permitted`, configure (or create) the network security policy in `app/src/main/res/xml/network_security_config.xml`:
```xml
<?xml version="1.0" encoding="utf-8"?>
<network-security-config>
    <domain-config cleartextTrafficPermitted="true">
        <domain includeSubdomains="true">ollama.local</domain>
        <!-- Add other specific IPs as needed -->
    </domain-config>
</network-security-config>
```
Then modify `AndroidManifest.xml`:

```xml
...
<uses-permission android:name="android.permission.INTERNET"/>
<application
    ...
    android:networkSecurityConfig="@xml/network_security_config">
    ...
</application>
</manifest>
```
- Download the latest APK from GitHub Releases and transfer it to your Android device.
- If your device does not allow installing APKs from untrusted sources, search for how to enable installation from unknown sources for your device.

Obtainium allows users to update/download apps directly from their sources, like GitHub or F-Droid.
- Download the Obtainium app by choosing your device architecture or 'Download Universal APK'.
- From the bottom menu, select '➕Add App'
- In the text field labelled 'App source URL *', enter the following URL and click 'Add' beside the text field:
https://github.com/shubham0204/SmolChat-Android
- SmolChat should now be visible in the 'Apps' screen. You can get notifications about newer releases and download them directly without going to the GitHub repo.
- Provide a usable user interface to interact with SLMs (small language models) locally, on-device
- Allow users to add/remove SLMs (GGUF models) and modify their system prompts or inference parameters (temperature, min-p)
- Allow users to create specific-downstream tasks quickly and use SLMs to generate responses
- Simple, easy to understand, extensible codebase
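Min-p, one of the inference parameters mentioned above, keeps only tokens whose probability is at least `min_p` times the probability of the most likely token, then renormalizes before sampling. The app's actual sampler lives in llama.cpp; the sketch below (with a made-up `minPFilter` helper) only illustrates the idea:

```kotlin
// Illustrative min-p filtering: drop tokens far below the top token's
// probability, then renormalize the survivors to sum to 1.
fun minPFilter(probs: Map<String, Double>, minP: Double): Map<String, Double> {
    val maxProb = probs.values.maxOrNull()!!
    val kept = probs.filterValues { it >= minP * maxProb }
    val total = kept.values.sum()
    return kept.mapValues { it.value / total }
}

fun main() {
    val probs = mapOf("a" to 0.6, "b" to 0.3, "c" to 0.1)
    // With minP = 0.5 the threshold is 0.3, so "a" and "b" survive.
    println(minPFilter(probs, 0.5))
}
```

A lower `min_p` admits more low-probability tokens (more diverse output); a higher value makes sampling more conservative.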
- Clone the repository with its submodule originating from llama.cpp:

```shell
git clone --depth=1 https://github.com/shubham0204/SmolChat-Android
cd SmolChat-Android
git submodule update --init --recursive
```
- Android Studio starts building the project automatically. If not, select Build > Rebuild Project to start a project build.
- After a successful project build, connect an Android device to your system. Once connected, the device's name should appear in the top menu bar in Android Studio.
- The application uses llama.cpp to load and execute GGUF models. As llama.cpp is written in pure C/C++, it is easy to compile on Android-based targets using the NDK.
- The `smollm` module uses an `llm_inference.cpp` class which interacts with llama.cpp's C-style API to execute the GGUF model, and a JNI binding `smollm.cpp`. Check the C++ source files here. On the Kotlin side, the `SmolLM` class provides the required methods to interact with the JNI (C++ side) bindings.
- The `app` module contains the application logic and UI code. Whenever a new chat is opened, the app instantiates the `SmolLM` class and provides it the model file path, which is stored by the `LLMModel` entity. Next, the app adds messages with roles `user` and `system` to the chat by retrieving them from the database and using `LLMInference::addChatMessage`.
- For tasks, the messages are not persisted; we inform `LLMInference` of this by passing `_storeChats=false` to `LLMInference::loadModel`.
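Conceptually, the chat-vs-task distinction above boils down to whether messages are written back to storage. A minimal sketch of that behavior (class and method names here are illustrative stand-ins, not the app's actual API):

```kotlin
data class ChatMessage(val role: String, val content: String)

// Illustrative only: models the difference between the in-memory context
// (always used for generation) and persisted chats gated by a storeChats flag.
class InferenceSketch(private val storeChats: Boolean) {
    val context = mutableListOf<ChatMessage>()   // fed to the model
    val database = mutableListOf<ChatMessage>()  // persisted only for chats
    fun addChatMessage(role: String, content: String) {
        val msg = ChatMessage(role, content)
        context.add(msg)
        if (storeChats) database.add(msg)
    }
}

fun main() {
    val chat = InferenceSketch(storeChats = true)
    chat.addChatMessage("user", "Hello!")
    val task = InferenceSketch(storeChats = false)
    task.addChatMessage("user", "Summarize this.")
    println("chat persisted=${chat.database.size}, task persisted=${task.database.size}")
}
```
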
- ggerganov/llama.cpp is a pure C/C++ framework to execute machine learning models on multiple execution backends. It provides a primitive C-style API to interact with LLMs converted to the GGUF format native to ggml/llama.cpp. The app uses JNI bindings to interact with a small class `smollm.cpp` which uses llama.cpp to load and execute GGUF models.
- noties/Markwon is a markdown rendering library for Android. The app uses Markwon and Prism4j (for code syntax highlighting) to render Markdown responses from the SLMs.
- shubham0204/Android-Doc-QA: On-device RAG-based question answering from documents
- shubham0204/OnDevice-Face-Recognition-Android: Realtime face recognition with FaceNet, Mediapipe and ObjectBox's vector database
- shubham0204/FaceRecognition_With_FaceNet_Android: Realtime face recognition with FaceNet, MLKit
- shubham0204/CLIP-Android: On-device CLIP inference in Android (search images with textual queries)
- shubham0204/Segment-Anything-Android: Execute Meta's SAM model in Android with onnxruntime
- shubham0204/Depth-Anything-Android: Execute the Depth-Anything model in Android with onnxruntime for monocular depth estimation
- shubham0204/Sentence-Embeddings-Android: Generate sentence embeddings (from models like `all-MiniLM-L6-V2`) in Android
The following features/tasks are planned for future releases of the app:
- Assign names to chats automatically (just like ChatGPT and Claude)
- Add a search bar to the navigation drawer to search for messages within chats
- Add a background service which uses Bluetooth/HTTP/WiFi to communicate with a desktop application to send queries from the desktop to the mobile device for inference
- Enable auto-scroll when generating a partial response in `ChatActivity`
- Measure RAM consumption
- Integrate Android-Doc-QA for on-device RAG-based question answering from documents
- Check if llama.cpp can be compiled to use Vulkan for inference on Android devices (and use the mobile GPU)