Commit f05ee46

zzzzwwjjwhx-sjtu authored and committed
fix: fix deepseek accuracy when ep_size=1
Signed-off-by: zzzzwwjj <1183291235@qq.com>
1 parent a990949 commit f05ee46

File tree

2 files changed (+2, -0 lines)


vllm_ascend/ops/fused_moe.py

Lines changed: 1 addition & 0 deletions

@@ -198,6 +198,7 @@ def fused_experts(
     num_experts = w1.shape[0]
     dtype = hidden_states.dtype
     device = hidden_states.device
+    topk_weights = topk_weights.to(dtype)
     # assert dtype in [torch.float32, torch.float16, torch.bfloat16
     #                  ], "Only float32, float16, and bfloat16 are supported"

vllm_ascend/quantization/w8a8_dynamic.py

Lines changed: 1 addition & 0 deletions

@@ -342,6 +342,7 @@ def fused_experts(hidden_states: torch.Tensor,
     num_experts = w1.shape[0]
     dtype = hidden_states.dtype
     device = hidden_states.device
+    topk_weights = topk_weights.to(dtype)

     if expert_map is not None:
         # Generate token indices and flatten
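Both hunks apply the same one-line fix: the router's top-k weights (typically produced by a float32 softmax) are cast to the activation dtype before being used to combine expert outputs. The sketch below is a minimal, hypothetical illustration of that pattern; `combine_expert_outputs` and its shapes are assumptions for demonstration, not the actual vllm-ascend implementation.

```python
import torch

def combine_expert_outputs(expert_out: torch.Tensor,
                           topk_weights: torch.Tensor) -> torch.Tensor:
    # Mirror the commit's fix: cast router weights (often float32) to the
    # activation dtype before the weighted sum over the top-k experts.
    topk_weights = topk_weights.to(expert_out.dtype)
    # expert_out: [tokens, top_k, hidden]; weights broadcast over hidden dim.
    return (expert_out * topk_weights.unsqueeze(-1)).sum(dim=1)

hidden = torch.randn(4, 2, 8, dtype=torch.float16)  # [tokens, top_k, hidden]
weights = torch.softmax(torch.randn(4, 2), dim=-1)  # float32 router weights
out = combine_expert_outputs(hidden, weights)
print(out.dtype)  # torch.float16
```

Without the cast, mixing float32 weights with half-precision activations can change the dtype or numeric path of the combine step, which is consistent with the accuracy issue this commit reports for ep_size=1.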
