How to support recomputing the KV cache after a certain decoded token #6886
jiazhan-msft
announced in
Ideas
Replies: 0 comments
My model has a feature that switches the model setup after a certain decoded token. For example, once decoding reaches the n-th token, the model needs to recompute the KV cache for all previous tokens. What is a possible path to supporting this? Thanks!
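To make the request concrete, here is a minimal toy sketch of the behavior I have in mind. It is not tied to any real inference engine: `ToyModel`, `scale`, `prefill`, `next_token`, `switch_at`, and `new_scale` are all hypothetical names invented for illustration, and the "attention" is just integer arithmetic. The only point is the control flow: when the switch token is reached, the old cache is discarded and rebuilt by re-running prefill over the whole sequence decoded so far.

```python
class ToyModel:
    """Hypothetical stand-in for a decoder; `scale` plays the role of the switchable setup."""

    def __init__(self, scale=1, vocab=17):
        self.scale = scale
        self.vocab = vocab

    def kv(self, token):
        # Cache entries depend on the current setup, so a setup switch
        # makes every previously cached entry stale.
        return (token * self.scale, token + self.scale)

    def prefill(self, tokens):
        # Compute cache entries for a whole sequence in one pass.
        return [self.kv(t) for t in tokens]

    def next_token(self, cache):
        # Toy "attention": combine all cached entries into the next token id.
        return sum(k + v for k, v in cache) % self.vocab


def generate(model, prompt, max_new_tokens, switch_at, new_scale):
    tokens = list(prompt)
    cache = model.prefill(tokens)
    for step in range(max_new_tokens):
        tok = model.next_token(cache)
        tokens.append(tok)
        if step + 1 == switch_at:
            # Switch the setup, then drop the stale cache and rebuild it
            # over everything decoded so far (prompt + generated tokens).
            model.scale = new_scale
            cache = model.prefill(tokens)
        else:
            # Normal incremental decoding: append one entry per new token.
            cache.append(model.kv(tok))
    return tokens, cache


model = ToyModel(scale=1)
tokens, cache = generate(model, [3, 5], max_new_tokens=6, switch_at=3, new_scale=2)
```

After the switch, the cache should be indistinguishable from one built from scratch under the new setup, which is the property I would want from real support. The cost is one extra prefill over the prefix at the switch point.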