Replies: 2 comments 1 reply
-
I have a notebook with the AMD 5700U processor and it works. With 64 GB of RAM I can run 70B models too; speed depends on the quantization method and the model. Sure, I have to wait, but I watch movies or chat in the meantime. Even Stable Diffusion with an XL model runs on my processor.
-
With airoboros-l2-70b-2.1 (GGUF) and Q5_K quantization I get 1260.18 ms per token, but with other 70B models (GGML) and other quantizations I sometimes got under 500 ms/token. I've read about the different methods, and I don't want to lose much accuracy; the methods also use different amounts of RAM. I should note that I limited power consumption to 25 W. At the moment I'm testing phind-codellama-34b-v2.Q5_K_M.gguf.
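To put those numbers in context, here is a small back-of-the-envelope sketch converting the reported per-token latency into throughput and estimating the RAM footprint of a quantized model. The figures of ~5.5 bits per weight for Q5_K_M and 70e9 parameters for a 70B model are my assumptions, not from the thread:

```python
def tokens_per_second(ms_per_token: float) -> float:
    """Convert per-token latency (ms) into tokens per second."""
    return 1000.0 / ms_per_token

def model_size_gb(params: float, bits_per_weight: float) -> float:
    """Rough in-RAM size of a quantized model in gigabytes."""
    return params * bits_per_weight / 8 / 1e9

# 1260.18 ms/token as reported above:
print(f"{tokens_per_second(1260.18):.2f} tok/s")  # ~0.79 tok/s

# Assumed: 70B params at ~5.5 bits/weight (typical for Q5_K_M):
print(f"{model_size_gb(70e9, 5.5):.1f} GB")       # ~48.1 GB
```

This also shows why 64 GB of RAM is comfortable for a Q5-quantized 70B model, while lower-bit quantizations trade accuracy for a smaller footprint.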
-
I couldn't find anything about running LLMs on those cheap mini PCs (mostly from China). What speeds can they achieve (7B, 13B, etc.)?
Some have Intel processors (e.g. n3350, j4125, n95, n100, n305, 11400H), others AMD (e.g. 4500U, 5500U, 5700U, 5600H, 5800H).
If anyone has one and could run a CPU benchmark and share the results, I'd appreciate it. Thanks in advance.