KV cache disk offload #13346
Unanswered
ha-seungwon asked this question in Q&A
Replies: 0 comments
I'm trying to run `llama.cpp` on a small machine, but the KV cache is too large. Instead of pre-allocating the KV cache in memory as a buffer, is there a way to offload it to disk?
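
To make the question concrete, here is a rough sketch of the idea in Python: estimate how large the KV cache would be, then back a buffer of that size with a file through a memory map instead of allocating it in RAM. This is not `llama.cpp` code or its API. The helper names (`kv_cache_bytes`, `open_disk_backed_buffer`), the file name, and the model shape are all hypothetical, and the size formula assumes a standard transformer cache with 2-byte (f16) elements. Access through a mapping like this would likely be much slower than RAM once the OS starts paging.

```python
import mmap
import os
import tempfile


def kv_cache_bytes(n_layers, n_ctx, n_kv_heads, head_dim, bytes_per_elem=2):
    """Approximate KV cache size: keys and values (the factor of 2)
    for every layer, context position, and KV head."""
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_elem


def open_disk_backed_buffer(path, size):
    """Create a file of `size` bytes and map it into memory.
    The OS pages data between RAM and the file on demand."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        os.ftruncate(fd, size)        # grow the backing file to the buffer size
        return mmap.mmap(fd, size)    # read/write mapping backed by the file
    finally:
        os.close(fd)                  # the mapping keeps its own handle


# Hypothetical model shape, chosen only for illustration.
size = kv_cache_bytes(n_layers=32, n_ctx=4096, n_kv_heads=8, head_dim=128)
print(f"estimated KV cache: {size / 1024**2:.0f} MiB")

# Small demonstration buffer so the example does not create a large file.
with tempfile.TemporaryDirectory() as tmp:
    buf = open_disk_backed_buffer(os.path.join(tmp, "kv_cache.bin"), 4096)
    buf[0:4] = b"\x01\x02\x03\x04"    # writes go to the mapped pages
    buf.flush()                       # push dirty pages to the file
    buf.close()
```

What I'm asking is whether `llama.cpp` has something equivalent built in for the KV cache, so that it does not need to fit in memory up front.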