[NIXL] vllm v0 nixl integration #16677

rainj-me · 2025-04-15T18:39:41Z

Features supported in this PR

vLLM v0 NIXL integration without etcd/NATS as dependencies
HTTP APIs to handshake NIXL connections
MLA layout support
MTP and speculative decoding support
Remote prefill worker skips sampling to reduce TTFT

Usage scripts

# Decode worker
vllm serve /data/models/QwQ-32B --tensor-parallel-size 2 --port 8080 --max-model-len 131072 --swap-space 0 --block-size 128 --trust-remote-code --kv-transfer-config '{"kv_connector":"DynamoNixlConnector"}' --enable-chunked-prefill false

# Remote prefill worker
vllm serve /data/models/QwQ-32B --tensor-parallel-size 1 --port 8090 --max-model-len 65536 --swap-space 0 --block-size 128 --gpu-memory-utilization 0.95 --trust-remote-code --enforce-eager --kv-transfer-config '{"kv_connector":"DynamoNixlConnector", "kv_connector_extra_config":{"skip_sampling":true}}' --enable-chunked-prefill false

# Establish nixl conn
curl -kvvv -XPOST http://127.0.0.1:8080/add_remote_prefill_eps  -H "Content-Type: application/json" -d '{"endpoints":["http://127.0.0.1:8090"]}'

# Test command
curl http://127.0.0.1:8080/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "/data/models/QwQ-32B", "temperature": 0, "messages": [ {"role": "user", "content": "San Francisco is "}], "max_tokens": 300, "stream": false}'
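
The handshake command above registers a single prefill endpoint. As a rough sketch, assuming the `/add_remote_prefill_eps` route accepts a JSON object with an `endpoints` array exactly as in that curl call, a small Bash helper could build the body for several prefill workers. The function names and the `DECODE_URL` variable are illustrative and not part of this PR, and the URLs are assumed to contain no characters that need JSON escaping.

```shell
#!/usr/bin/env bash
# Illustrative helpers (not part of the PR) for the NIXL handshake step.

# Decode worker address; the default matches the usage scripts above.
DECODE_URL="${DECODE_URL:-http://127.0.0.1:8080}"

# Join all arguments into {"endpoints":["url1","url2",...]}.
# Assumption: URLs contain no quotes or backslashes, so no JSON escaping is done.
build_endpoints_payload() {
  local body="" url
  for url in "$@"; do
    # Prepend a comma only when body already holds an entry.
    body+="${body:+,}\"${url}\""
  done
  printf '{"endpoints":[%s]}' "$body"
}

# POST the payload to the decode worker (requires both workers to be running).
register_prefill_workers() {
  curl -sS -X POST "${DECODE_URL}/add_remote_prefill_eps" \
    -H "Content-Type: application/json" \
    -d "$(build_endpoints_payload "$@")"
}

# Example:
#   register_prefill_workers http://127.0.0.1:8090
```

Keeping payload construction separate from the curl call makes it possible to inspect the request body without a running server.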

What is left

Chunked-prefill support
Pipeline parallelism (PP) support
vLLM V1 support
Keep only DynamoNixlConnector and clean up the other connectors

github-actions · 2025-04-15T18:39:50Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

mergify · 2025-04-15T18:40:21Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @rainj-me.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify · 2025-04-19T04:53:34Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @rainj-me.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Signed-off-by: Changqi Lu <luchangqi.123@bytedance.com>

mergify bot added frontend speculative-decoding labels Apr 15, 2025

mergify bot added the needs-rebase label Apr 15, 2025

rainj-me added 6 commits April 15, 2025 23:09

apply patch for dynamo

1daa1c3

support mtp for dynamo remote prefill

2be9bdc

integrate nixl directly to vllm

c8c4faa

support add and remove remote prefill endpoint apis

dac0543

not require for decode to add nixl metadata from prefill

a3c9862

skip sample on remote prefill worker

a7eca8f

rainj-me force-pushed the feat/vllm-nixl branch from 7873584 to a7eca8f Compare April 15, 2025 23:12

mergify bot removed the needs-rebase label Apr 15, 2025

mergify bot added the needs-rebase label Apr 19, 2025

fix synchronize before transfer blocks

7cf1c0e

Signed-off-by: Changqi Lu <luchangqi.123@bytedance.com>

zeroorhero force-pushed the feat/vllm-nixl branch from 12d1d9d to 7cf1c0e Compare April 21, 2025 02:51

rainj-me mentioned this pull request Jun 6, 2025

[RFC][Feature] Support Remote Prefill in PD Disaggregation sgl-project/sglang#6925


njhill added the v0 label Jun 19, 2025

mergify bot added the deepseek Related to DeepSeek models label Jul 2, 2025
