I am using a single A100-PCIE-40GB GPU, but I am unable to assign model layers to the GPU. #8331
Replies: 3 comments 15 replies
Use
@chenbingweb try this in the terminal
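The exact command from this reply was not preserved in the thread. As a hedged sketch of what such a terminal command typically looks like for llama.cpp GPU offload (the model path and port below are illustrative, not from the thread), `-ngl`/`--n-gpu-layers` is the flag that controls how many model layers are placed on the GPU:

```shell
# Illustrative sketch only -- not the original command from this reply.
# -ngl N offloads N layers to the GPU; a large value such as 99 is a
# common shorthand for "offload every layer". On a 40 GB A100, a Q2_K
# 7B model fits entirely on the GPU.
./llama-server \
  -m ./qwen2-7b-instruct-q2_k.gguf \
  -ngl 99 \
  --host 127.0.0.1 --port 8080
```

If layers still land on the CPU with a flag like this, the usual cause is a build compiled without CUDA support, which the `ldd` check later in this thread is meant to diagnose.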
@chenbingweb Here is my test using the model from https://huggingface.co/Qwen/Qwen2-7B-Instruct-GGUF/blob/main/qwen2-7b-instruct-q2_k.gguf, run with llama.cui from https://github.com/dspasyuk/llama.cui. It uses the same settings that I mentioned above and the most recent release of llama.cpp. Screencast.from.2024-07-10.10.04.06.AM.webm
@chenbingweb what is the output of `ldd /root/llama.cpp/build/bin/llama-server`? Also, why are you doing this in the /root folder?
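The `ldd` question above is a way to check whether the `llama-server` binary was actually linked against the CUDA libraries; if it was not, no layers can be offloaded regardless of the flags passed. A sketch of that check (the binary path is taken from the reply above; the grep pattern is an assumption about typical CUDA library names):

```shell
# List the shared libraries the binary links against and filter for CUDA.
# If nothing matches, the build has no GPU backend and inference will
# silently run on the CPU.
ldd /root/llama.cpp/build/bin/llama-server | grep -Ei 'cuda|cublas'
```

If the check comes up empty, the usual fix is to rebuild llama.cpp with the CUDA backend enabled in CMake before retrying the server.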