Hello, my system has an RTX 3070 and CUDA installed properly. I downloaded llama-b5688-bin-win-cuda-12.4-x64 and used llama-cli to load a local model, but it looks like it loaded into system RAM instead of GPU memory. Does anyone know how to get the model to load onto the GPU? Thanks.

Replies: 2 comments
-
Use `-ngl 99`
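For context, `-ngl` (`--n-gpu-layers`) tells llama.cpp how many model layers to offload to VRAM; 99 is simply a value large enough to cover every layer of most models. A minimal invocation might look like the sketch below (the model path is just a placeholder):

```bat
REM Run from inside the extracted llama-b5688-bin-win-cuda-12.4-x64 folder
llama-cli -m C:\models\my-model.gguf -ngl 99 -p "Hello"
```

If the offload works, the startup log should report layers being placed on the GPU rather than everything staying in system RAM, and nvidia-smi will show the corresponding VRAM usage.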
-
Thanks! Looks like I also have to copy the cudart-llama-bin-win-cuda-12.4-x64 runtime files into the folder for it to work :)
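The CUDA builds of llama.cpp are published without the CUDA runtime DLLs, which ship in the separate matching cudart zip; those DLLs need to sit next to llama-cli.exe. A sketch of the copy step, assuming both archives were extracted into sibling folders:

```bat
REM Copy the CUDA runtime DLLs next to llama-cli.exe
REM (assumes both zips were extracted side by side)
copy cudart-llama-bin-win-cuda-12.4-x64\*.dll llama-b5688-bin-win-cuda-12.4-x64\
```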