Commit d76c4fb

harygo22 authored and weijinqian_v1 committed
fix a bug
Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
1 parent d24758e commit d76c4fb

1 file changed: +2 -1 lines changed

vllm_ascend/ascend_forward_context.py

Lines changed: 2 additions & 1 deletion
@@ -22,7 +22,8 @@ def get_fused_moe_state(ep_size: int, with_prefill: bool):
     if ep_size == 1:
         return FusedMoEState.AllGather
     elif envs_ascend.VLLM_ASCEND_ENABLE_MOE_ALL2ALL_SEQ:
-        return FusedMoEState.All2AllSeq if ep_size < 16 else FusedMoEState.MC2
+        # MC2 Dispatch/Combine performs better than alltoall_seq in decoding stage.
+        return FusedMoEState.All2AllSeq if (ep_size < 16 or with_prefill) else FusedMoEState.MC2
     # NOTE: mc2 need ep_size >= 16 & all2all can't use in torchair graph.
     elif ep_size < 16 or with_prefill:
         return FusedMoEState.All2All
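
The selection logic after this change can be sketched as a standalone function, which makes the affected case easy to check. This is a minimal sketch, not the project's code: the `FusedMoEState` enum below is a stand-in with only the members named in the diff, the `enable_all2all_seq` parameter is a hypothetical replacement for reading `envs_ascend.VLLM_ASCEND_ENABLE_MOE_ALL2ALL_SEQ`, and the final `else` branch is an assumption, since the hunk ends before the fallback is shown.

```python
from enum import Enum


class FusedMoEState(Enum):
    # Stand-in enum: only the members that appear in the diff.
    AllGather = 0
    All2All = 1
    MC2 = 2
    All2AllSeq = 3


def get_fused_moe_state(ep_size: int, with_prefill: bool,
                        enable_all2all_seq: bool) -> FusedMoEState:
    # enable_all2all_seq is a hypothetical parameter standing in for
    # envs_ascend.VLLM_ASCEND_ENABLE_MOE_ALL2ALL_SEQ.
    if ep_size == 1:
        return FusedMoEState.AllGather
    elif enable_all2all_seq:
        # After the commit: prefill also selects All2AllSeq, so MC2 is
        # chosen only for decode with ep_size >= 16.
        if ep_size < 16 or with_prefill:
            return FusedMoEState.All2AllSeq
        return FusedMoEState.MC2
    elif ep_size < 16 or with_prefill:
        return FusedMoEState.All2All
    else:
        # Assumption: the hunk does not show this fallback branch.
        return FusedMoEState.MC2


if __name__ == "__main__":
    # The case the commit changes: flag on, ep_size >= 16, prefill.
    # The old condition (ep_size < 16 only) would have selected MC2 here.
    print(get_fused_moe_state(16, True, enable_all2all_seq=True))
    # Decode with ep_size >= 16 still selects MC2.
    print(get_fused_moe_state(16, False, enable_all2all_seq=True))
```

Only the combination of the flag being set, `ep_size >= 16`, and `with_prefill=True` behaves differently from the parent commit; every other input takes the same branch as before.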
