FlashAttention 2 implementing PagedAttention #7321
shashank2000 announced in Q&A
Hi - it appears PagedAttention support was recently added in FlashAttention 2.5. Could this make it easier to integrate a new model with vLLM?
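For context, here is a minimal sketch of the interface the question is referring to: FlashAttention 2.5's `flash_attn_with_kvcache` accepts a `block_table`, so the KV cache can live in fixed-size, non-contiguous pages, which is the same idea as vLLM's PagedAttention. Shapes follow the flash-attn documentation; the sizes, tensor names, and block layout below are illustrative assumptions, not anything from this discussion.

```python
# Sketch of FlashAttention 2.5's paged KV-cache path (illustrative sizes).
# Requires a CUDA GPU and the flash-attn package installed.
import torch
from flash_attn import flash_attn_with_kvcache

batch, heads, head_dim = 2, 8, 64
page_size, num_pages, pages_per_seq = 256, 16, 4

# One decode-step query per sequence: (batch, seqlen_q=1, heads, head_dim)
q = torch.randn(batch, 1, heads, head_dim, dtype=torch.float16, device="cuda")

# Paged cache: (num_pages, page_size, heads, head_dim) instead of one
# contiguous (batch, max_seqlen, ...) tensor per sequence.
k_cache = torch.randn(num_pages, page_size, heads, head_dim, dtype=torch.float16, device="cuda")
v_cache = torch.randn_like(k_cache)

# block_table maps each sequence to the physical pages holding its KV history.
block_table = torch.arange(batch * pages_per_seq, dtype=torch.int32, device="cuda").reshape(batch, pages_per_seq)
# Number of tokens already cached for each sequence.
cache_seqlens = torch.full((batch,), 300, dtype=torch.int32, device="cuda")

out = flash_attn_with_kvcache(
    q, k_cache, v_cache,
    cache_seqlens=cache_seqlens,
    block_table=block_table,
    causal=True,
)
print(out.shape)  # (batch, 1, heads, head_dim)
```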