
[0.9.1][Perf] Launch load kv task asynchronously with thread pool. #1612


@ganyi1996ppo ganyi1996ppo commented Jul 3, 2025

What this PR does / why we need it?

The current LLMDataDistCMgrConnector implementation connects and pulls KV in a synchronous manner, which may degrade the latency of the decode task if the consumer node is constantly receiving tasks pushed from the remote producer node. omni_infer launches llmdatadist's pull_kv method in another thread, which gives better overlap between pulling the KV cache and running the model. That implementation performs better than the synchronous path.

In this PR, we bring this asynchronous approach into vllm-ascend and also launch the link, pull_kv, and request_finished tasks asynchronously.
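The general pattern can be sketched as follows. This is a minimal illustration using Python's standard `concurrent.futures.ThreadPoolExecutor`, not the code from this PR: `pull_kv` here is a hypothetical stand-in for a blocking KV transfer, and `AsyncKVLoader`, `start_load`, and `get_finished` are names assumed for the example only.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Dict, Set


def pull_kv(request_id: str) -> str:
    """Hypothetical stand-in for a blocking KV-cache pull from a producer node."""
    time.sleep(0.01)  # simulate transfer latency
    return request_id


class AsyncKVLoader:
    """Hypothetical helper: submits pulls to a thread pool and polls for completion."""

    def __init__(self, max_workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._pending: Dict[str, Future] = {}

    def start_load(self, request_id: str) -> None:
        # Returns immediately; the pull itself runs on a worker thread.
        self._pending[request_id] = self._pool.submit(pull_kv, request_id)

    def get_finished(self) -> Set[str]:
        # Non-blocking poll: collect the requests whose pull has completed.
        done = {rid for rid, fut in self._pending.items() if fut.done()}
        for rid in done:
            # result() re-raises any exception thrown on the worker thread.
            self._pending.pop(rid).result()
        return done

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)
```

In a sketch like this, the caller would invoke `start_load` when a request arrives and call `get_finished` once per step, so the model run is not blocked while transfers are in flight.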

Does this PR introduce any user-facing change?

No user-facing changes.

How was this patch tested?

Signed-off-by: ganyi <pleaplusone.gy@gmail.com>
@ganyi1996ppo ganyi1996ppo marked this pull request as ready for review July 5, 2025 12:03
@ganyi1996ppo ganyi1996ppo requested a review from wangxiyuan July 5, 2025 12:04
@ganyi1996ppo ganyi1996ppo assigned Yikun and unassigned Yikun Jul 5, 2025
@ganyi1996ppo ganyi1996ppo requested a review from Yikun July 6, 2025 00:21
@ganyi1996ppo ganyi1996ppo changed the title from "[Perf] Launch load kv task asynchronizely with thread pool." to "[Perf] Launch load kv task asynchronously with thread pool." Jul 7, 2025
@wangxiyuan wangxiyuan changed the title from "[Perf] Launch load kv task asynchronously with thread pool." to "[0.9.1][Perf] Launch load kv task asynchronously with thread pool." Jul 7, 2025
@ganyi1996ppo ganyi1996ppo merged commit ffd1d9a into vllm-project:v0.9.1-dev Jul 8, 2025
16 checks passed
3 participants