Commit 8aeaa91

luccafong and Lucia (Lu) Fang authored
Fix unknown attribute of topk_indices_dtype in CompressedTensorsW8A8Fp8MoECutlassMethod (#20507)
Co-authored-by: Lucia (Lu) Fang <fanglu@meta.com>
1 parent 906e05d commit 8aeaa91
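
A minimal sketch of the kind of failure the commit title describes, using hypothetical stand-in classes and a hypothetical caller rather than vLLM's actual code: reading `topk_indices_dtype` from an object whose `__init__` never set it raises `AttributeError`, while defaulting it to `None` in `__init__` lets the read succeed.

```python
class MethodWithoutAttr:
    """Hypothetical stand-in: __init__ never sets topk_indices_dtype."""

    def __init__(self):
        self.disable_expert_map = False


class MethodWithAttr:
    """Hypothetical stand-in: __init__ defaults topk_indices_dtype to None."""

    def __init__(self):
        self.disable_expert_map = False
        self.topk_indices_dtype = None


def select_indices_dtype(method):
    """Hypothetical caller that reads the attribute from the method object."""
    return method.topk_indices_dtype


def reads_cleanly(method):
    """Return True if the attribute read succeeds, False on AttributeError."""
    try:
        select_indices_dtype(method)
        return True
    except AttributeError:
        return False


if __name__ == "__main__":
    print(reads_cleanly(MethodWithoutAttr()))
    print(reads_cleanly(MethodWithAttr()))
```

Setting the default in `__init__`, as the diff does, keeps the attribute present on every instance so callers do not need a `getattr` fallback.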

1 file changed: +2 -0 lines changed

vllm/model_executor/layers/quantization/compressed_tensors/compressed_tensors_moe.py

Lines changed: 2 additions & 0 deletions
@@ -368,6 +368,7 @@ def __init__(
             "weights")
         self.input_quant = self.quant_config.target_scheme_map["Linear"].get(
             "input_activations")
+        self.topk_indices_dtype = None

         per_tensor = (self.weight_quant.strategy == QuantizationStrategy.TENSOR
                       and self.input_quant.strategy
@@ -738,6 +739,7 @@ def __init__(

         from vllm.model_executor.layers.fused_moe.cutlass_moe import (
             cutlass_moe_fp8)
+        self.topk_indices_dtype = None
         self.fused_experts = cutlass_moe_fp8  # type: ignore
         self.disable_expert_map = False
