vllm-project / vllm Public

v1/offloading: Add worker-side CPU support #21448

Open

orozery wants to merge 2 commits into vllm-project:main

from orozery:cpu-offloading-worker

+806 −0

Conversation 4 Commits 2 Checks 4 Files changed 8

v1/offloading: Add worker-side CPU support

3540bd7

Checks

- Mergify: Summary
- DCO: DCO
- pre-commit (on: pull_request): pre-commit
- Lint and Deploy Charts (on: pull_request): lint-and-deploy

Mergify / Summary succeeded Aug 4, 2025 in 0s

4 rules match and 15 potential rules

Rule: label-documentation (label)

any of:
- files~=^[^/]+\.md$
- files~=^docs/
- files~=^examples/

✅ Rule: label-ci-build (label)

any of:
- files~=\.buildkite/
- files=CMakeLists.txt
- files=setup.py
- files~=^\.github/
- files~=^cmake/
- files~=^docker/Dockerfile
- files~=^requirements.*\.txt

Rule: label-deepseek (label)

any of:
- files~=^examples/.*deepseek.*\.py
- files~=^tests/.*deepseek.*\.py
- files~=^vllm/entrypoints/openai/tool_parsers/.*deepseek.*\.py
- files~=^vllm/model_executor/models/.*deepseek.*\.py
- files~=^vllm/reasoning/.*deepseek.*\.py
- files~=^vllm/transformers_utils/.*deepseek.*\.py
- title~=(?i)DeepSeek

Rule: label-frontend (label)

files~=^vllm/entrypoints/

Rule: label-llama (label)

any of:
- files~=^examples/.*llama.*\.py
- files~=^tests/.*llama.*\.py
- files~=^vllm/entrypoints/openai/tool_parsers/llama.*\.py
- files~=^vllm/model_executor/models/.*llama.*\.py
- files~=^vllm/transformers_utils/configs/.*llama.*\.py
- title~=(?i)llama

Rule: label-multi-modality (label)

any of:
- files=tests/models/test_vision.py
- files~=^tests/models/multimodal/
- files~=^tests/multimodal/
- files~=^vllm/multimodal/

Rule: label-new-model (label)

all of:
- files=vllm/model_executor/models/registry.py
- files~=^vllm/model_executor/models/

Rule: label-performance (label)

any of:
- files~=^\.buildkite/nightly-benchmarks/
- files~=^benchmarks/
- files~=^tests/benchmarks/
- files~=^vllm/benchmarks/

Rule: label-qwen (label)

any of:
- files~=^examples/.*qwen.*\.py
- files~=^tests/.*qwen.*\.py
- files~=^vllm/model_executor/models/.*qwen.*\.py
- files~=^vllm/reasoning/.*qwen.*\.py
- title~=(?i)Qwen

Rule: label-rocm (label)

any of:
- files=vllm/platforms/rocm.py
- files~=^csrc/rocm/
- files~=^docker/Dockerfile.rocm
- files~=^requirements/rocm.*\.txt
- files~=^tests/kernels/.*_rocm.*\.py
- files~=^vllm/attention/backends/rocm.*\.py
- files~=^vllm/attention/ops/rocm.*\.py
- files~=^vllm/model_executor/layers/fused_moe/rocm.*\.py
- files~=^vllm/v1/attention/backends/mla/rocm.*\.py
- title~=(?i)AMD
- title~=(?i)ROCm

Rule: label-structured-output (label)

any of:
- files=benchmarks/benchmark_serving_structured_output.py
- files=benchmarks/run_structured_output_benchmark.sh
- files=docs/features/structured_outputs.md
- files=examples/offline_inference/structured_outputs.py
- files=examples/online_serving/openai_chat_completion_structured_outputs.py
- files=examples/online_serving/openai_chat_completion_structured_outputs_with_reasoning.py
- files=tests/v1/entrypoints/llm/test_guided_generate.py
- files~=^benchmarks/structured_schemas/
- files~=^tests/v1/structured_output/
- files~=^vllm/v1/structured_output/

Rule: label-speculative-decoding (label)

any of:
- files=vllm/model_executor/models/mlp_speculator.py
- files~=^examples/.*(spec_decode|mlpspeculator|eagle|speculation).*\.py
- files~=^tests/v1/spec_decode/
- files~=^vllm/model_executor/models/.*eagle.*\.py
- files~=^vllm/transformers_utils/configs/(eagle|medusa|mlp_speculator)\.py
- files~=^vllm/v1/spec_decode/

✅ Rule: label-v1 (label)

any of:
- files~=^tests/v1/
- files~=^vllm/v1/
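
The `any of` block above can be read as: the rule fires when at least one changed file matches at least one pattern. A minimal Python sketch of that evaluation follows, assuming `~=` behaves like an unanchored regular-expression search. The helper name `any_of_files` and the file paths are hypothetical and are not taken from this PR's file list.

```python
import re

def any_of_files(patterns, changed_files):
    """Return True if any changed file matches any of the regex patterns."""
    return any(re.search(p, f) for p in patterns for f in changed_files)

# Patterns copied from the label-v1 rule.
LABEL_V1 = [r"^tests/v1/", r"^vllm/v1/"]

# Hypothetical changed files, for illustration only.
changed = ["vllm/v1/example_module.py", "README.md"]

print(any_of_files(LABEL_V1, changed))        # True
print(any_of_files(LABEL_V1, ["README.md"]))  # False
```

Because both patterns are anchored with `^`, a path such as `docs/vllm/v1/notes.md` would not match in this sketch.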

Rule: label-tpu (label)

any of:
- files~=/tpu/
- files~=_tpu
- files~=pallas
- files~=tpu.py
- files~=tpu_

✅ Rule: label-tpu-remove (label)

all of:
- -files~=/tpu/
- -files~=_tpu
- -files~=pallas
- -files~=tpu.py
- -files~=tpu_
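
The negated `all of` block above is the mirror image of label-tpu: it holds only when none of the TPU patterns match any changed file. A small Python sketch of that logic follows, assuming a leading `-` negates the condition and `~=` is an unanchored regex search. The function names and file paths are hypothetical.

```python
import re

# Patterns copied from the label-tpu-remove rule (without the leading "-").
TPU_PATTERNS = [r"/tpu/", r"_tpu", r"pallas", r"tpu.py", r"tpu_"]

def no_file_matches(pattern, changed_files):
    """Model of a negated condition: True when no changed file matches."""
    return not any(re.search(pattern, f) for f in changed_files)

def tpu_remove_applies(changed_files):
    """Model of 'all of': every negated condition must hold."""
    return all(no_file_matches(p, changed_files) for p in TPU_PATTERNS)

# Hypothetical file lists, for illustration only.
print(tpu_remove_applies(["vllm/v1/example_module.py"]))  # True
print(tpu_remove_applies(["vllm/platforms/tpu.py"]))      # False
```

Note that the `.` in `tpu.py` is unescaped, so under regex semantics it matches any character, not only a literal dot.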

Rule: label-tool-calling (label)

any of:
- files=docs/features/tool_calling.md
- files=examples/offline_inference/chat_with_tools.py
- files=examples/online_serving/openai_chat_completion_client_with_tools.py
- files=examples/online_serving/openai_chat_completion_client_with_tools_required.py
- files=examples/online_serving/openai_chat_completion_tool_calls_with_reasoning.py
- files=tests/entrypoints/openai/test_chat_with_tool_reasoning.py
- files~=^examples/tool_chat_*
- files~=^tests/entrypoints/openai/tool_parsers/
- files~=^tests/mistral_tool_use/
- files~=^tests/tool_use/
- files~=^vllm/entrypoints/openai/tool_parsers/
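
One pattern in the list above, `^examples/tool_chat_*`, looks like a glob. If it is interpreted as a regular expression, the trailing `_*` means "zero or more underscores". The sketch below shows what that implies, assuming unanchored regex search semantics. The paths are hypothetical.

```python
import re

# Pattern copied from the label-tool-calling rule.
PATTERN = r"^examples/tool_chat_*"

def matches(path):
    """True if the path matches PATTERN under regex search semantics."""
    return re.search(PATTERN, path) is not None

print(matches("examples/tool_chat_template_example.jinja"))  # True
print(matches("examples/tool_chat"))                         # True
print(matches("examples/tool_chatter.txt"))                  # True
print(matches("docs/examples/tool_chat_x"))                  # False
```

In this reading the pattern is effectively a prefix match on `examples/tool_chat`.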

Rule: ping author on conflicts and add 'needs-rebase' label (comment, label)

conflict
-closed

Rule: assign reviewer for tensorizer changes (assign)

files~=^tests/entrypoints/openai/test_tensorizer_entrypoint.py
files~=^tests/tensorizer_loader/
files~=^vllm/model_executor/model_loader/tensorizer.py
files~=^vllm/model_executor/model_loader/tensorizer_loader.py

✅ Rule: remove 'needs-rebase' label when conflict is resolved (label)

-closed
-conflict

Mergify commands and options

More conditions and actions can be found in the documentation.

You can also trigger Mergify actions by commenting on this pull request:

- @Mergifyio refresh will re-evaluate the rules
- @Mergifyio rebase will rebase this PR on its base branch
- @Mergifyio update will merge the base branch into this PR
- @Mergifyio backport <destination> will backport this PR to the <destination> branch

Additionally, on the Mergify dashboard you can:

- look at your merge queues
- generate the Mergify configuration with the config editor

Finally, you can contact us on https://mergify.com
