Why does vLLM use a custom all-reduce method? #6159

SamKG · 2024-07-05T20:01:25Z

Hello,
I was hoping someone could help shed some light on this:
why does vLLM use a custom all-reduce method? Is there a benefit to doing this over just using the NCCL APIs?

simon-mo · 2024-07-05T20:05:45Z
Maintainer

See the performance results in #2192. In certain cases, the custom topology drastically boosts performance compared to NCCL's implementation. vLLM still uses NCCL in the majority of cases.

3 replies

SamKG Jul 5, 2024
Author

Thanks, that helps a ton!

SamKG Jul 10, 2024
Author

@simon-mo Just wanted to quickly clarify something: I've observed strange performance scaling for the 1-stage all-reduce kernels (both NCCL's and vLLM's):

For small problem sizes, they actually perform slower than expected (this is the region where the 1-stage reduce is being used).

Theoretically, we should expect scaling that looks (roughly) like the below:

Is this expected?
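
For readers following along, one common way to reason about "expected" all-reduce scaling is a latency-bandwidth (alpha-beta) cost model. The sketch below is only an illustration under that assumption: the function names, the `alpha` (fixed per-stage latency, seconds) and `beta` (seconds per byte) parameters, and the example values are hypothetical, not measurements from vLLM or NCCL, and the model ignores link topology and kernel-launch details.

```python
def one_stage_time(nbytes, n_gpus, alpha, beta):
    """Modeled time for a 1-stage (one-shot) all-reduce.

    Assumption: each GPU reads the full buffer from its n_gpus - 1 peers
    in a single stage, so it pays one latency term and (n - 1) * nbytes
    of traffic.
    """
    return alpha + (n_gpus - 1) * nbytes * beta


def two_stage_time(nbytes, n_gpus, alpha, beta):
    """Modeled time for a 2-stage all-reduce (reduce-scatter + all-gather).

    Assumption: each stage pays one latency term and moves (n - 1) / n
    of the buffer per GPU.
    """
    return 2 * alpha + 2 * (n_gpus - 1) / n_gpus * nbytes * beta


def crossover_bytes(n_gpus, alpha, beta):
    """Buffer size at which the two modeled times are equal.

    With 2 GPUs the 1-stage model is never slower, so there is no crossover.
    """
    if n_gpus <= 2:
        return float("inf")
    return alpha * n_gpus / (beta * (n_gpus - 1) * (n_gpus - 2))


if __name__ == "__main__":
    # Hypothetical values: 5 microseconds per stage, 100 GB/s per GPU.
    alpha, beta, n = 5e-6, 1e-11, 4
    for size in (1 << 10, 1 << 14, 1 << 18, 1 << 22):
        t1 = one_stage_time(size, n, alpha, beta)
        t2 = two_stage_time(size, n, alpha, beta)
        print(f"{size:>8} B  1-stage {t1 * 1e6:8.2f} us  2-stage {t2 * 1e6:8.2f} us")
```

In this model the time for small buffers is dominated by `alpha`, so it stays nearly flat as the size shrinks and the effective bandwidth drops; whether that accounts for the measurements described above is not something the model alone can confirm.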

SamKG Jul 10, 2024
Author

cc @hanzhi713

cduk · 2024-11-04T13:22:36Z

In a setup where 4 GPUs are connected by PCIe, but each pair of GPUs is connected by NVLink (112 GB/s bidirectional), is there a way to specify a reduction first within each NVLink-connected pair of GPUs before reducing across the slower PCIe link?
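
The data flow being asked about is usually called a hierarchical (two-level) all-reduce. The sketch below only simulates that flow in plain Python to make the idea concrete; `hierarchical_all_reduce` and its `groups` argument are hypothetical names for illustration, and nothing here implies that vLLM or NCCL exposes such an option.

```python
def hierarchical_all_reduce(buffers, groups):
    """Simulate a two-level sum all-reduce.

    buffers: one list of numbers per GPU rank (all the same length).
    groups:  lists of rank indices that share a fast link
             (for example, NVLink-connected pairs).
    Returns the per-rank buffers after the all-reduce.
    """
    # Stage 1: reduce inside each fast group onto its first rank (the leader).
    leader_partials = []
    for group in groups:
        partial = [sum(vals) for vals in zip(*(buffers[r] for r in group))]
        leader_partials.append(partial)

    # Stage 2: reduce across group leaders; only this stage would cross
    # the slow link (PCIe in the setup described above).
    total = [sum(vals) for vals in zip(*leader_partials)]

    # Stage 3: each leader shares the result inside its group over the fast link.
    return [list(total) for _ in buffers]


if __name__ == "__main__":
    per_gpu = [[1, 2], [3, 4], [5, 6], [7, 8]]
    nvlink_pairs = [[0, 1], [2, 3]]
    print(hierarchical_all_reduce(per_gpu, nvlink_pairs))
```

With this layout only one partial buffer per pair crosses the slow link in stage 2, rather than one buffer per GPU.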

0 replies
