
Want to run on 1 GPU (of 2 total) but looks like the model is loaded onto both GPUs. #2752

Closed · Answered by isaacmorgan
isaacmorgan asked this question in Q&A

I found the cause for this; it is not a problem with LlamaCPP.

The X11 config had BaseMosaic enabled, which (for reasons I don't fully understand) caused this behavior.

https://forums.developer.nvidia.com/t/unwanted-duplicate-threads-processes-on-dual-p6000/155178/3
https://forums.developer.nvidia.com/t/memory-is-allocated-on-all-gpus/183110
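For reference, BaseMosaic is an option in the Device section of the X server configuration (typically /etc/X11/xorg.conf or a file under /etc/X11/xorg.conf.d/). A minimal sketch of disabling it is below; the Identifier and BusID values are placeholders and will differ per system:

```
# Hypothetical Device section for /etc/X11/xorg.conf — identifier and BusID are placeholders.
Section "Device"
    Identifier "GPU0"
    Driver     "nvidia"
    BusID      "PCI:1:0:0"          # adjust to the card's actual PCI bus ID (see nvidia-xconfig --query-gpu-info)
    Option     "BaseMosaic" "off"   # disable Base Mosaic so the X server does not span both GPUs
EndSection
```

Independently of the X11 fix, restricting CUDA's view of the GPUs (e.g. setting CUDA_VISIBLE_DEVICES=0 before launching llama.cpp) is a quick way to confirm whether the duplicate allocation comes from the driver/X server rather than from llama.cpp itself.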

Answer selected by isaacmorgan