[perf]Support MOE Multi-stream in Deepseek #947

David9857 · 2025-05-24T10:19:41Z

What this PR does / why we need it?

Support MOE inner Multi-stream for Deepseek.
This feature requires graph mode with mc2 enabled.

Does this PR introduce any user-facing change?

How was this patch tested?

wangxiyuan · 2025-05-29T13:14:15Z

vllm_ascend/quantization/w8a8_dynamic.py

    global_bs = 0
    moe_expert_num = len(expert_map)
    # hidden_states = hidden_states.bfloat16()
-    kwargs = {
+    kwargs1 = {


Please rename this to a more readable name.

wangxiyuan · 2025-05-29T13:14:45Z

vllm_ascend/envs.py

@@ -36,6 +36,8 @@
    lambda: bool(int(os.getenv("COMPILE_CUSTOM_KERNELS", "1"))),
    "VLLM_ENABLE_MC2":
    lambda: bool(int(os.getenv("VLLM_ENABLE_MC2", '0'))),
+    "VLLM_ENABLE_CV_PARALLEL":


Use additional_config instead of an env variable, as #839 does, since this change only applies to torchair GE mode. Another 3 new config options are coming as well.

How about:

    {
        "additional_config": {
            "torchair_graph_config": {
                "enable": True,
                "enable_cv_parallel": True,
                "batch_sizes": "12345",
                "batch_sizes_init": True
            }
        }
    }

cc @zzzzwwjj
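
To make the suggestion above concrete, here is a minimal sketch of how such a flag could be resolved from a nested config dict. It is an illustration only, not the code merged in this PR: `resolve_cv_parallel` is a hypothetical helper, the key is assumed to be spelled `enable_cv_parallel` (matching the `self.enable_cv_parallel` attribute in the diff), and the mc2 check reuses the `VLLM_ENABLE_MC2` parsing shown in `vllm_ascend/envs.py`. Combining the three conditions follows the PR description, which says the feature requires graph mode with mc2 enabled.

```python
import os
from typing import Any, Mapping, Optional


def resolve_cv_parallel(
    additional_config: Optional[Mapping[str, Any]],
    environ: Mapping[str, str] = os.environ,
) -> bool:
    """Hypothetical helper: decide whether CV parallel should be active.

    Returns True only when the option is requested, torchair graph mode
    is enabled, and mc2 is switched on through VLLM_ENABLE_MC2.
    """
    # A missing or empty section behaves like "everything disabled".
    graph_cfg = (additional_config or {}).get("torchair_graph_config") or {}
    graph_mode = bool(graph_cfg.get("enable", False))
    requested = bool(graph_cfg.get("enable_cv_parallel", False))
    # Same "0"/"1" parsing as the VLLM_ENABLE_MC2 entry in envs.py.
    mc2_enabled = bool(int(environ.get("VLLM_ENABLE_MC2", "0")))
    return requested and graph_mode and mc2_enabled


if __name__ == "__main__":
    config = {
        "torchair_graph_config": {
            "enable": True,
            "enable_cv_parallel": True,
        }
    }
    print(resolve_cv_parallel(config, {"VLLM_ENABLE_MC2": "1"}))
```

Keeping the option under `torchair_graph_config` groups it with the other graph-mode settings, so a user who never enables graph mode never sees an unrelated environment variable.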

wangxiyuan · 2025-05-29T13:28:08Z

And don't forget to add an e2e test. The model weights are here: https://www.modelscope.cn/models/vllm-ascend/DeepSeek-V2-Lite-W8A8

Take https://github.com/vllm-project/vllm-ascend/blob/main/tests/multicard/test_offline_inference_distributed.py#L49 as an example.

github-actions · 2025-06-04T10:32:06Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

github-actions · 2025-06-05T08:36:42Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

wangxiyuan · 2025-06-05T08:37:15Z

vllm_ascend/models/deepseek_v2.py

@@ -179,6 +180,12 @@ def __init__(
        else:
            self.gate.e_score_correction_bias = None

+        self.enable_cv_parallel = False
+        additional_config = get_current_vllm_config().additional_config


Please use ascend_config instead now. Note that the doc should be updated at the same time.

github-actions · 2025-06-05T13:59:25Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

github-actions bot added module:ops module:quantization labels May 24, 2025

David9857 changed the title ~~[perf][WIP] Support MOE Multi-stream in Deepseek~~ [perf]Support MOE Multi-stream in Deepseek May 26, 2025

David9857 force-pushed the cv branch from 25e3d2c to 78a00c3 Compare May 28, 2025 07:42

github-actions bot added module:core ci/build module:tests module:tools labels May 28, 2025

David9857 force-pushed the cv branch from 88b4098 to 9ddb591 Compare May 29, 2025 03:42

github-actions bot removed ci/build module:tests module:tools labels May 29, 2025

wangxiyuan reviewed May 29, 2025

wangxiyuan mentioned this pull request May 29, 2025

feat: support compile torchair graph while warming up #839

Merged

github-actions bot added module:tests and removed module:core labels May 29, 2025

David9857 force-pushed the cv branch from 34df77e to 92d7c24 Compare June 3, 2025 14:15

github-actions bot added documentation Improvements or additions to documentation module:core labels Jun 3, 2025

David9857 force-pushed the cv branch from 92d7c24 to d7f8be5 Compare June 3, 2025 14:17

github-actions bot removed documentation Improvements or additions to documentation module:tests module:core labels Jun 3, 2025

David9857 added 2 commits June 4, 2025 11:46

support moe multistream in deepseek

1074413

Signed-off-by: David9857 <985700846@qq.com>
use additional_config to enable cv parallel
Signed-off-by: David9857 <985700846@qq.com>
rename kwargs1 in fused_experts_with_mc2
Signed-off-by: David9857 <985700846@qq.com>

support cv parallel for float model

3630856

Signed-off-by: David9857 <985700846@qq.com>

David9857 force-pushed the cv branch from 2fc95a6 to 3630856 Compare June 4, 2025 03:47

David9857 added 2 commits June 4, 2025 14:06

Merge remote-tracking branch 'upstream/main' into cv

6b59afe

Merge remote-tracking branch 'upstream/main' into cv

24b5e4d

github-actions bot added the merge-conflicts label Jun 4, 2025

Merge remote-tracking branch 'upstream/main' into cv

4fa61d5

github-actions bot added merge-conflicts and removed merge-conflicts labels Jun 4, 2025

wangxiyuan reviewed Jun 5, 2025

refactor in deepseek moe

3511331

Signed-off-by: David9857 <985700846@qq.com>

github-actions bot added module:core and removed merge-conflicts labels Jun 5, 2025

David9857 force-pushed the cv branch from 1cdea6b to ac812df Compare June 5, 2025 13:42

remove cv parallel for float model

415394c

Signed-off-by: David9857 <985700846@qq.com>

David9857 force-pushed the cv branch from ac812df to 415394c Compare June 5, 2025 13:58

github-actions bot added merge-conflicts and removed module:core labels Jun 5, 2025

Merge remote-tracking branch 'upstream/main' into cv

051073a

github-actions bot added module:core and removed merge-conflicts labels Jun 5, 2025

update torchair config

354ff2c

Signed-off-by: David9857 <985700846@qq.com>
bugfix
Signed-off-by: David9857 <985700846@qq.com>

David9857 force-pushed the cv branch from a1e329d to 354ff2c Compare June 5, 2025 14:11

fix ut for ascend config

4ae80fd

Signed-off-by: David9857 <985700846@qq.com>

github-actions bot added the module:tests label Jun 5, 2025

wangxiyuan approved these changes Jun 5, 2025

wangxiyuan merged commit 78431b3 into vllm-project:main Jun 5, 2025
23 checks passed

sdmyzlp mentioned this pull request Jun 9, 2025

Support multistream of shared experts in FusedMoE #997

Merged
