[Draft][WIP][Feature]cpu offload connector #1659

lidenghui1110 · 2025-07-07T14:48:10Z

What this PR does / why we need it?

This PR implements a CPU offload connector that enables offloading the NPU KV cache to host DRAM.
This PR depends on vLLM changes that start a metadata-server process. The metadata-server manages cpu_kv_cache and exposes RPC functions for the connector to call. It is designed to support a KV cache shared between DP EngineCore instances.
The metadata-server code is still in progress; we are trying to implement it in vllm-ascend to avoid a long pull request merge cycle in vLLM.
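
To make the intended split concrete, here is a toy, in-process sketch of the idea: one metadata service owns the CPU block table, and several engine-side callers query it. Everything in it is an assumption made for illustration (the class name, the method names, keying blocks by a string hash, and the absence of any real RPC layer); it is not the metadata-server code this PR refers to.

```python
from typing import Dict, List, Optional


class ToyMetadataServer:
    """Hypothetical stand-in for the metadata-server.

    Tracks which CPU block id holds which KV block, so that several
    callers (for example DP engine cores) can share one CPU cache.
    """

    def __init__(self, num_cpu_blocks: int) -> None:
        self._free: List[int] = list(range(num_cpu_blocks))
        self._table: Dict[str, int] = {}  # block hash -> CPU block id

    def lookup(self, block_hash: str) -> Optional[int]:
        """Return the CPU block id for a hash, or None if it is not cached."""
        return self._table.get(block_hash)

    def allocate(self, block_hash: str) -> Optional[int]:
        """Reserve a CPU block for a hash; reuse the existing one if present.

        Returns None when no free CPU block is left (no eviction in this toy).
        """
        if block_hash in self._table:
            return self._table[block_hash]
        if not self._free:
            return None
        block_id = self._free.pop()
        self._table[block_hash] = block_id
        return block_id


if __name__ == "__main__":
    server = ToyMetadataServer(num_cpu_blocks=2)
    # Two callers sharing the same server see the same mapping.
    first = server.allocate("hash-a")
    assert server.lookup("hash-a") == first
```

In a real deployment the two methods would be reached over RPC from separate processes; the sketch only shows why a single owner of the table lets multiple engine cores hit the same offloaded blocks.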

Does this PR introduce any user-facing change?

Users enable CPU offload with the following parameters:

    --kv-transfer-config \
        '{
            "kv_connector": "CPUOffloadingConnector",
            "kv_connector_module_path": "vllm_ascend.distributed.kv_transfer.cpu_offloading_connector",
            "kv_role": "kv_both",
            "kv_connector_extra_config": {"swap_in_threshold": 0, "cpu_swap_space_gb": 800}
        }'
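
The same configuration can also be assembled programmatically, which avoids shell-quoting mistakes in the nested JSON. This is a minimal sketch using only Python's standard `json` module; the helper name `build_kv_transfer_config` is hypothetical and not part of this PR, while the keys and default values are copied from the command above.

```python
import json


def build_kv_transfer_config(swap_in_threshold: int = 0,
                             cpu_swap_space_gb: int = 800) -> str:
    """Return a JSON string suitable for --kv-transfer-config.

    Hypothetical helper for illustration; the keys mirror the PR description.
    """
    config = {
        "kv_connector": "CPUOffloadingConnector",
        "kv_connector_module_path":
            "vllm_ascend.distributed.kv_transfer.cpu_offloading_connector",
        "kv_role": "kv_both",
        "kv_connector_extra_config": {
            "swap_in_threshold": swap_in_threshold,
            "cpu_swap_space_gb": cpu_swap_space_gb,
        },
    }
    return json.dumps(config)


if __name__ == "__main__":
    # Print the string so it can be pasted after --kv-transfer-config.
    print(build_kv_transfer_config())
```

Building the string with `json.dumps` guarantees valid JSON, so a typo in quoting or a missing comma shows up as a Python error rather than as a launch-time parse failure.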

How was this patch tested?

vLLM version: v0.9.1
vLLM main: vllm-project/vllm@110df74

github-actions · 2025-07-09T00:54:40Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

cpu offload connector

e79d775

lidenghui1110 changed the title ~~[draft][wip][feature]cpu offload connector~~ [Draft][WIP][Feature]cpu offload connector Jul 7, 2025

github-actions bot added the merge-conflicts label Jul 9, 2025
