Quantizing/converting Llama 3.1 on latest master results in an unloadable model. Anything I'm missing? #9077
-
Looks like it should have been solved by #8676 but hasn't? I built llama.cpp locally on macOS, downloaded the models from HuggingFace (recently), and ran the conversion and quantization steps. The end result is the same as the one mentioned here: #8650 (comment). I'm on commit
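For context, the usual flow looks roughly like this; the model path and the Q4_K_M quant type below are placeholders, not my exact invocation:

```sh
# Convert the HuggingFace checkpoint to GGUF (path is a placeholder)
python convert_hf_to_gguf.py ./Meta-Llama-3.1-8B-Instruct \
    --outfile llama-3.1-8b-instruct-f16.gguf --outtype f16

# Quantize the F16 GGUF (Q4_K_M chosen purely as an example)
./llama-quantize llama-3.1-8b-instruct-f16.gguf \
    llama-3.1-8b-instruct-q4_k_m.gguf Q4_K_M

# Try to load the quantized model
./llama-cli -m llama-3.1-8b-instruct-q4_k_m.gguf -p "Hello"
```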
Replies: 1 comment
-
Updating for whoever needs it: the `llama-cli` and other binaries in the root folder of the project are not symlinks to the recently built ones, so the `llama-cli` there will never match the one you just built after running the build commands. Just build it yourself and run it from the build folder, and it should work with the quantized models.
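In other words, something like this (a sketch assuming a CMake build; adjust the model path to your own file):

```sh
# Build fresh binaries; they end up under build/bin/, not in the repo root
cmake -B build
cmake --build build --config Release

# Run the freshly built llama-cli from the build folder, not the stale copy in the root
./build/bin/llama-cli -m llama-3.1-8b-instruct-q4_k_m.gguf -p "Hello"
```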