Replies: 9 comments 1 reply
-
Interesting. The multimodality aspect sounds great, but the LLM inference speed doesn’t seem to be significantly different on phones, from what folks report:
-
I think there's still a big difference.
-
PocketPal: 5 tokens/s
-
MNN beats llama.cpp.
-
llama.cpp is easier to use; I hope its performance can be improved.
-
Speed test: PocketPal vs MNN Chat
Prompt: "Why the sky is blue?"
Results:
PocketPal model: bartowski/Qwen2.5-3B-Instruct-GGUF
Device: Samsung SM-G780G, Android 13, 8 cores, 7.4 GB RAM
App version: 1.8.5
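The tokens/s figures quoted in this thread come from timing a generation call; a minimal sketch of how such a number can be computed (the `generate` callable is a placeholder, not the real API of PocketPal or MNN Chat):

```python
import time

def tokens_per_second(generate, prompt):
    """Time one generation call and report decode speed.

    `generate` is a stand-in for whatever inference call the app
    exposes; it is assumed to return the list of generated tokens.
    """
    start = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    return len(tokens) / elapsed

# Usage with a stand-in "model" that emits 50 tokens:
fake_generate = lambda prompt: ["tok"] * 50
speed = tokens_per_second(fake_generate, "Why the sky is blue?")
print(f"{speed:.1f} tokens/s")
```

Note that apps often report prefill and decode speed separately, so a single wall-clock figure like this is only a rough comparison.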
-
MNN vs PocketPal on a Pixel 6a with an Unsloth model: MNN is 3x faster.
-
The new MNN version is even faster than the previous one: https://github.com/alibaba/MNN/blob/master/apps/Android/MnnLlmChat/README.md#version-040
-
MNN is faster. I don't have statistics; honestly, no need, trust me. But not everything is about speed. The ecosystem matters.
-
llama.cpp vs mnn-llm speed
https://www.reddit.com/r/LocalLLaMA/s/tDh1l0cvVe
https://github.com/alibaba/MNN/blob/master/project/android/apps/MnnLlmApp/README.md
https://github.com/alibaba/MNN