How to run LLAMA 2 70B model using llama.cpp: not working on new build #3015
-
Hi all, I had an M2 running the LLAMA 2 70B model successfully using gqa and ggmlv3, but with build 1154 and the new format, I get the following error when trying to run llama.cpp:
-
Yes. It should work.
-
The issue is the conversion, not running it.
You need to specify
--gqa 8 --eps 1e-5
for the GGML-to-GGUF conversion script. (The missing --gqa flag is what's causing your error, but using the wrong eps value will also affect the quality of your output.)
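For reference, a sketch of what the full invocation might look like, assuming the convert-llama-ggml-to-gguf.py script from the llama.cpp repo and placeholder input/output filenames (adjust both to match your setup and checkout):

```shell
# Convert a GGMLv3 LLaMA 2 70B file to GGUF (paths are placeholders).
# --gqa 8    : LLaMA 2 70B uses grouped-query attention (8 KV heads),
#              which the old GGML file does not record, so it must be given here.
# --eps 1e-5 : RMSNorm epsilon used by LLaMA 2; a wrong value degrades output quality.
python convert-llama-ggml-to-gguf.py \
  --input  llama-2-70b.ggmlv3.q4_0.bin \
  --output llama-2-70b.q4_0.gguf \
  --gqa 8 --eps 1e-5
```

The two flags are the ones named in the reply; the script and file names here are illustrative, not verified against your build.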