Replies: 1 comment
-
I want to know this as well.
-
With PyTorch and the Hugging Face Transformers framework, I can get the KV cache like this:

```python
# Generate one token and keep the cache returned alongside the output.
generated = model.generate(input_ids, max_new_tokens=1, return_dict_in_generate=True)
kv = generated['past_key_values']
```

How can I get the corresponding KV cache with llama.cpp?
I'd really appreciate any answers, thanks a million!
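Some context while an answer is pending: unlike Transformers, llama.cpp does not (to my knowledge) hand back per-layer K/V tensors; recent versions of its C API expose the cache only as part of the serialized context state (e.g. via `llama_state_get_data`). Whatever the API, the cache's memory footprint is fixed by the model shape, which is easy to estimate. A minimal sketch (parameter names and the example model shape are illustrative):

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, dtype_bytes: int = 2) -> int:
    """Estimate KV cache size: two tensors (K and V) per layer,
    each of shape (n_kv_heads, seq_len, head_dim)."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * dtype_bytes

# Example: a Llama-2-7B-like shape (32 layers, 32 KV heads, head_dim 128)
# at a 2048-token context in fp16:
print(kv_cache_bytes(32, 32, 128, 2048) / 2**30)  # → 1.0 (GiB)
```

This is why the full serialized state from llama.cpp can be large even for short prompts.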