-
I have an RTX 2080 Ti 11GB and a Tesla P40 24GB in my machine. First of all, when I try to compile llama.cpp I am asked to set UPDATE: it only "seems to load" if the values of
Replies: 19 comments 20 replies
-
If I load Meta-Llama-3-8B-Instruct.f16.gguf it works fine and seems to use both GPUs. But trying to load Meta-Llama-3-70B-Instruct.f16.gguf results in this crash:
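A rough size check may help explain the difference between the two models. The sketch below assumes 2 bytes per weight for F16 and round parameter counts of 8 and 70 billion, and it ignores the KV cache and other overhead, so treat it as a back-of-the-envelope estimate and not as a diagnosis of the crash. The function names are made up for illustration.

```shell
# Rough weight size in GB: billions of parameters times bytes per weight.
weights_gb() {
    echo $(( $1 * $2 ))
}

# Succeeds (exit status 0) if the weights alone fit into the given VRAM in GB.
fits_in_vram() {
    [ "$(weights_gb "$1" "$2")" -le "$3" ]
}

total_vram_gb=$(( 11 + 24 ))   # RTX 2080 Ti + Tesla P40

for params in 8 70; do
    if fits_in_vram "$params" 2 "$total_vram_gb"; then
        echo "${params}B F16: ~$(weights_gb "$params" 2) GB, fits in ${total_vram_gb} GB of VRAM"
    else
        echo "${params}B F16: ~$(weights_gb "$params" 2) GB, does not fit in ${total_vram_gb} GB of VRAM"
    fi
done
```

When the weights do not fit, llama.cpp can offload only some of the layers to the GPUs (the `-ngl` / `--n-gpu-layers` option) and keep the rest in system RAM.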
-
Hmmm, the
-
Oh, I am certainly not using Docker either, and I assumed that the variable was just badly misnamed and has no connection with Docker :) So, if I don't set the CUDA_DOCKER_ARCH variable then I get this error:
Here is my nvcc version:
-
Oh dear, the version of nvidia-cuda-toolkit in Ubuntu 22.04 is 11.5. Does this mean I have to install one manually instead of using what comes with Ubuntu 22.04?
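To see which toolkit version is actually on the PATH, the release number can be pulled out of the nvcc --version output and compared against a minimum. A minimal sketch, assuming GNU sort (for -V) and assuming 12.0 as the minimum purely for illustration; the thread does not state which version llama.cpp needs. The helper names are hypothetical.

```shell
# Succeeds if version $1 is greater than or equal to version $2 (needs sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extracts e.g. "11.5" from the "release 11.5," part of `nvcc --version` on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

required="12.0"   # assumed minimum, for illustration only

if command -v nvcc >/dev/null 2>&1; then
    installed="$(nvcc --version | nvcc_release)"
    if version_ge "$installed" "$required"; then
        echo "nvcc $installed is new enough"
    else
        echo "nvcc $installed is older than $required"
    fi
else
    echo "nvcc not found on PATH"
fi
```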
-
Oh, no, that means removing the nvidia kernel driver and installing
-
I tried to manually install CUDA 12.5.1 but it failed. Of course I removed nvidia-driver-525, disabled the nouveau driver and rebooted, but it still failed due to a compiler problem. So I will now try your suggestion instead. Thank you.
-
Oh no, even after installing nvidia-driver-545 and doing
What can I do? Investigate why the manual installation of cuda-12.5.1 fails with that compiler issue? Or is there a better way? I had no idea that upgrading the CUDA Toolkit on a relatively recent OS (Ubuntu 22.04.4) is such a nightmare...
-
But the strange thing is that on that machine the kernel is stuck at version 6.5.0-17 for some reason, while on all my other Ubuntu 22.04.4 machines the kernel is version 6.5.0-41. Still, even on the other machines nvidia-cuda-toolkit goes only up to 11.5. My lsb_release -a output is:
-
Maybe I should try installing nvidia-driver-555 manually and then install cuda-12.5.1 with the driver option disabled? Ah, no, the driver version only goes up to 550, but CUDA 12.5.1 requires 555.
-
Try using the official installation guide from Nvidia: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_local
You have to purge first: sudo apt autoremove 'nvidia*' --purge
-
OK, I did all that successfully, and now I have the 555 driver:
All other installation steps completed as well. However, I do not have nvcc at all. And if I type nvcc I get a suggestion to install nvidia-cuda-toolkit, but it is still the wrong version:
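A likely cause of the missing nvcc is that the packages from NVIDIA's repository install the toolkit under /usr/local/cuda-<version>, usually with a /usr/local/cuda symlink, and do not add its bin directory to PATH. A minimal sketch, assuming that default location (set CUDA_HOME first if yours differs):

```shell
# Assumed default install prefix; NVIDIA's packages normally create this symlink.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"

# Prepend the toolkit's bin directory unless it is already on PATH.
case ":$PATH:" in
    *":$CUDA_HOME/bin:"*) ;;
    *) PATH="$CUDA_HOME/bin:$PATH" ;;
esac
export PATH

# Make the shared libraries visible too (the ${VAR:+...} form avoids a trailing colon).
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

if command -v nvcc >/dev/null 2>&1; then
    nvcc --version
else
    echo "nvcc still not found; check that $CUDA_HOME/bin exists"
fi
```

Adding these lines to ~/.bashrc makes the change persist across logins.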
-
Did you use the Nvidia official setup? wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
-
Oh dear, the problem is now worse, the P40 GPU is gone! The 555 driver does not see it!
-
Here is what I see in dmesg:
-
Yes! Bravo, Denis!
-
Hmmm, at the moment I am not happy with the performance of the original 70B F16. I think it was much faster without the P40 (split between just the RTX 2080 Ti 11GB and the CPU with 128GB). Also, killing the llama-server process via ^C leaves a zombie process wasting 100% of one CPU:
But I will try smaller quantised versions like Q8 or Q6.
-
Oh dear, using 8B F16 Llama3 I got this error:
I hope this is just a one-off and not a broken P40 card. Is there a way to properly run diagnostics on it?
-
Hmmm, yes, the P40 GPU is broken, so I am returning it to the eBay seller. I can consistently reproduce this problem under high load using Blender (via OptiX); it causes these errors in the log and
But all your help was certainly NOT in vain. First of all, I will most likely get a replacement P40 GPU. Secondly, even with just the RTX 2080 Ti it is much nicer to use CUDA 12.5 than 11.5. Thank you again!
@tigran123 Open the Additional Drivers settings dialog (it should look like this) and install any non-open driver above 525.
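The "non-open" part matters because NVIDIA's open kernel modules only support Turing and newer GPUs, so a Pascal card such as the P40 needs the proprietary modules. A minimal sketch for checking which flavor is currently loaded, assuming the open modules identify themselves with the words "Open Kernel Module" in /proc/driver/nvidia/version; the function name is hypothetical.

```shell
# Classifies a driver banner read on stdin as "open" or "proprietary".
# Assumption: the open kernel modules print "Open Kernel Module" in their banner.
driver_flavor() {
    if grep -q "Open Kernel Module"; then
        echo "open"
    else
        echo "proprietary"
    fi
}

if [ -r /proc/driver/nvidia/version ]; then
    echo "Loaded NVIDIA kernel module: $(driver_flavor < /proc/driver/nvidia/version)"
else
    echo "No NVIDIA kernel module is loaded"
fi
```

After switching drivers and rebooting, nvidia-smi -L lists the GPUs the driver can see, so both cards should show up there.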