Commit 376bcde

Address comment: add expert_map==None assertion in pplx_prepare_finalize
Signed-off-by: Ming Yang <yming@meta.com>
1 parent 7c57bb0 commit 376bcde

File tree: 2 files changed, +4 −2 lines changed

vllm/model_executor/layers/fused_moe/deepep_ll_prepare_finalize.py (1 addition, 1 deletion)

@@ -65,7 +65,7 @@ def max_num_tokens_per_rank(self) -> Optional[int]:
         return self.max_tokens_per_rank

     def topk_indices_dtype(self) -> Optional[torch.dtype]:
-        return torch.int64
+        return torch.int32

     def _do_quant(
         self,
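The hunk above narrows the routing-index dtype from int64 to int32, presumably because the downstream DeepEP low-latency kernels consume 32-bit indices. A minimal sketch of the cast (not vLLM code; it assumes the ids originate from torch.topk, which returns int64 by default):

```python
import torch

# Routing: pick the top-2 experts per token from a score matrix.
scores = torch.rand(4, 8)              # 4 tokens, 8 experts
_, topk_ids = torch.topk(scores, k=2, dim=-1)
assert topk_ids.dtype == torch.int64   # torch.topk yields int64 indices

# Cast down to the dtype advertised by topk_indices_dtype() after this commit.
topk_ids = topk_ids.to(torch.int32)
assert topk_ids.dtype == torch.int32
```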

vllm/model_executor/layers/fused_moe/pplx_prepare_finalize.py (3 additions, 1 deletion)

@@ -100,7 +100,9 @@ def prepare(
         hidden_dim = a1.size(-1)  # K

         assert topk_ids.size(0) == num_tokens
-        # assert expert_map is None, "NYI"
+        assert expert_map is None, """with expert map, -1 id is used for
+            non-local token; this causes error when casting ids to the
+            topk_indices_dtype() uint32"""

         # Is this always going to be a1.device?
         device = a1.device
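The new assertion guards against exactly the failure its message describes: an expert map marks non-local tokens with id -1, and casting -1 to an unsigned 32-bit index dtype wraps around to 2**32 - 1, an out-of-range expert id. A hedged illustration of the wraparound (numpy stand-in, not vLLM code):

```python
import numpy as np

# expert_map-style ids: -1 marks a token routed to a non-local expert.
topk_ids = np.array([3, -1, 7], dtype=np.int64)

# Casting to an unsigned 32-bit dtype silently wraps the sentinel value.
wrapped = topk_ids.astype(np.uint32)
assert wrapped[1] == 2**32 - 1  # 4294967295: no longer a valid expert id
```

Asserting expert_map is None up front turns this silent corruption into an explicit, actionable error.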
