
Offloading model layers to the GPU does not reduce the RAM load. Is this normal behavior? #6496

Answered by Folko-Ven
Folko-Ven asked this question in Q&A

@slaren @phymbert
I conducted testing with another model that fully fit into RAM.
You were right, offloading to the GPU does indeed reduce RAM usage, although not as effectively as I had hoped.
Apparently, the model I originally wanted to run did not fit in RAM even with layers offloaded to the GPU.
I apologize for wasting your time unnecessarily.
