[0.9.1][Perf] Launch load kv task asynchronously with thread pool. #1612

ganyi1996ppo · 2025-07-03T07:55:02Z

What this PR does / why we need it?

The current LLMDataDistCMgrConnector implementation links to the remote node and pulls KV cache synchronously. This can degrade decode latency when the consumer node continuously receives tasks pushed from the remote producer node. omni_infer launches llmdatadist's pull_kv method in a separate thread, which gives better overlap between KV cache pulling and model execution and performs better than the synchronous path.

This PR brings the same asynchronous approach to vllm-ascend: the link, pull_kv and request_finished tasks are now also launched asynchronously.
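
For illustration only, the general pattern of handing blocking KV pulls to a thread pool might look like the sketch below. It is not the code from this PR: `pull_kv` here is a stand-in that only sleeps, and `AsyncKVLoader`, `start_load` and `get_finished` are hypothetical names chosen for the example. Only Python's standard `concurrent.futures` is used.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor


def pull_kv(request_id: str) -> str:
    """Hypothetical stand-in for a blocking KV cache pull from a remote producer."""
    time.sleep(0.01)  # simulate transfer latency
    return request_id


class AsyncKVLoader:
    """Hypothetical helper: runs pulls in a thread pool and tracks finished requests."""

    def __init__(self, max_workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        self._finished: set[str] = set()

    def start_load(self, request_id: str) -> Future:
        # Returns immediately; the caller can continue with the model run.
        future = self._pool.submit(pull_kv, request_id)
        future.add_done_callback(self._on_done)
        return future

    def _on_done(self, future: Future) -> None:
        # Invoked once the pull completes; record only successful pulls.
        if future.cancelled() or future.exception() is not None:
            return
        with self._lock:
            self._finished.add(future.result())

    def get_finished(self) -> set[str]:
        # Hand back the request IDs finished since the last call, then reset.
        with self._lock:
            done, self._finished = self._finished, set()
        return done

    def shutdown(self) -> None:
        # Wait for all outstanding pulls (and their callbacks) to complete.
        self._pool.shutdown(wait=True)
```

In this sketch the scheduler-side loop would call `start_load` when a request arrives and poll `get_finished` between model steps, so the decode path never blocks on a transfer.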

Does this PR introduce any user-facing change?

No. There is no user-facing change.

How was this patch tested?

Signed-off-by: ganyi <pleaplusone.gy@gmail.com>

ganyi1996ppo added 3 commits July 3, 2025 15:53

bring asynchronous kv cache pulling phylosophy into vllm-ascend

ea2f5f6

Signed-off-by: ganyi <pleaplusone.gy@gmail.com>

fix lint

82230a8

Signed-off-by: ganyi <pleaplusone.gy@gmail.com>

async pullkv

8c2864c

Signed-off-by: ganyi <pleaplusone.gy@gmail.com>

ganyi1996ppo marked this pull request as ready for review July 5, 2025 12:03

ganyi1996ppo requested a review from wangxiyuan July 5, 2025 12:04

ganyi1996ppo assigned Yikun and unassigned Yikun Jul 5, 2025

ganyi1996ppo requested a review from Yikun July 6, 2025 00:21

fix mypy

77aa4dd

Signed-off-by: ganyi <pleaplusone.gy@gmail.com>

ganyi1996ppo changed the title ~~[Perf] Launch load kv task asynchronizely with thread pool.~~ [Perf] Launch load kv task asynchronously with thread pool. Jul 7, 2025

wangxiyuan changed the title ~~[Perf] Launch load kv task asynchronously with thread pool.~~ [0.9.1][Perf] Launch load kv task asynchronously with thread pool. Jul 7, 2025

ganyi1996ppo merged commit ffd1d9a into vllm-project:v0.9.1-dev Jul 8, 2025
16 checks passed

wangxiyuan added the no-main label Jul 8, 2025
