- Have you compared with https://github.com/NVIDIA/FasterTransformer and NVIDIA/FasterTransformer#506?

- Thanks for the question. Yes, we have compared performance with FasterTransformer in our research paper (to be released soon). vLLM achieves up to a 22x speedup over FasterTransformer. The main gain comes from the PagedAttention and continuous batching implemented in vLLM.
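For intuition, here is a minimal Python sketch of the block-table idea behind PagedAttention. This is illustrative only, not vLLM's actual implementation: names like `BlockAllocator` and `Sequence` are hypothetical.

```python
# Sketch of PagedAttention-style KV-cache paging (illustrative only;
# BlockAllocator / Sequence are hypothetical names, not vLLM internals).
BLOCK_SIZE = 4  # tokens stored per KV-cache block


class BlockAllocator:
    """Hands out fixed-size physical blocks from a shared pool."""

    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))

    def alloc(self):
        return self.free.pop()

    def release(self, blocks):
        self.free.extend(blocks)


class Sequence:
    """Tracks one request's logical-to-physical block mapping."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.block_table = []  # logical block i -> physical block id
        self.num_tokens = 0

    def append_token(self):
        # A new physical block is claimed only when the last one fills,
        # so memory is reserved on demand rather than for the max length.
        if self.num_tokens % BLOCK_SIZE == 0:
            self.block_table.append(self.allocator.alloc())
        self.num_tokens += 1


allocator = BlockAllocator(num_blocks=8)
a, b = Sequence(allocator), Sequence(allocator)
for _ in range(6):
    a.append_token()  # 6 tokens -> 2 blocks
b.append_token()      # 1 token  -> 1 block
print(a.block_table, b.block_table, allocator.free)
```

Because physical blocks are reserved on demand and returned to the pool when a request finishes, memory is not pre-allocated for the maximum sequence length, which is what lets the scheduler keep many requests in flight and batch them continuously.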