Llama 4 not working #1994
My fork of the project has added some Llama 4 updates: https://github.com/JamePeng/llama-cpp-python
Same issue here. How do I run Llama 4?
@kerlion |
image: nvidia/cuda:12.2.0-runtime-ubuntu22.04
I compiled it from source, which got past this error. But I do not know which "chat_format" to use.
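
On the chat_format question: if the build links a recent enough llama.cpp, you can usually leave chat_format unset, and llama-cpp-python will fall back to the chat template embedded in the GGUF metadata. A minimal sketch, assuming a local Llama 4 GGUF (the file name and parameters below are hypothetical, not from this thread):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-4-scout-instruct.Q4_K_M.gguf",  # hypothetical path
    n_gpu_layers=-1,  # offload all layers if built with CUDA support
    n_ctx=8192,       # context size; adjust to your memory budget
    # chat_format left unset: llama-cpp-python should fall back to the
    # chat template stored in the GGUF metadata, if the file provides one.
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello, Llama 4!"}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```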
@kerlion |
Same error with llama-cpp-python 0.3.8: `print_info: file format = GGUF V3 (latest)`
Could you please provide your commit number?
```
llama_model_load: error loading model: error loading model architecture: unknown model architecture: 'llama4'
llama_model_load_from_file_impl: failed to load model
```
Please update to a newer version of llama.cpp:
https://github.com/ggml-org/llama.cpp/releases/tag/b5074
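
For llama-cpp-python specifically, that means upgrading to a wheel whose vendored llama.cpp is at least b5074, or rebuilding from source against a newer llama.cpp. A quick sanity check of what is currently installed (a sketch; which package version first vendors a new enough llama.cpp is not stated in this thread and would need to be checked against the release notes):

```python
# Print the installed llama-cpp-python version before debugging further.
# NOTE: which package version first vendors a llama.cpp >= b5074 is an
# assumption to verify against the project's release notes.
import llama_cpp

print("llama-cpp-python:", llama_cpp.__version__)
```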