Hello, I'm building an Android app with llama.cpp and looking to get the best possible performance. For those who've done this before - what optimizations worked well for you?
I'm interested in all aspects - sampling settings, memory management, GPU usage (especially with mobile GPUs that use shared memory), or really any other tweaks that helped improve the experience.
Are the default settings good enough, or did you find specific configurations that made a noticeable difference?
Would appreciate any tips or lessons learned from running llama.cpp on Android devices!