Maybe the throughput between TGI and vLLM should be updated #478
Closed
zhaoyang-star
announced in
General
Replies: 1 comment
-
Hi! We are testing the performance of TGI and will update the performance results afterward. You can track the progress at #381.
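For context, throughput in comparisons like this is usually reported as generated tokens per second. A minimal sketch of how one might compute that figure from benchmark timings (the `generate` callable is a hypothetical stand-in for a TGI or vLLM client call, not an actual API of either project):

```python
import time

def measure_throughput(generate, prompts):
    """Time a batch of generation calls and return tokens per second.

    `generate` is any callable that takes a prompt and returns the
    number of tokens it produced (hypothetical placeholder for a real
    TGI or vLLM request).
    """
    start = time.perf_counter()
    total_tokens = sum(generate(p) for p in prompts)
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed

# Toy stand-in: pretend each prompt yields 128 tokens.
tput = measure_throughput(lambda p: 128, ["a", "b", "c", "d"])
print(f"{tput:.0f} tokens/s")
```

Real comparisons would of course run both servers on identical hardware, prompts, and sampling settings before dividing tokens by wall-clock time.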
-
I just noticed that TGI now supports Paged Attention by integrating vLLM. PR #516 has been merged and is available in TGI v0.9.2.