
Commit d6275e7

fix comment
Signed-off-by: Chengji Yao <chengjiyao@google.com>
1 parent fff63b2 commit d6275e7

1 file changed, +4 -1 lines changed


vllm/v1/attention/backends/pallas.py

Lines changed: 4 additions & 1 deletion
@@ -195,7 +195,10 @@ def forward(
         write_to_kv_cache(key, value, kv_cache, slot_mapping,
                           self.kv_cache_quantized_dtype,
                           layer._k_scale_float, layer._v_scale_float)
-
+        if self.kv_cache_quantized_dtype is not None and (
+                layer._k_scale_float == 0.0 or layer._v_scale_float == 0.0):
+            raise ValueError(
+                "k_scale_float and v_scale_float must be non-zero")
         output = torch.ops.xla.ragged_paged_attention(
             query,
             kv_cache,
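
For reference, a minimal standalone sketch of the validation this hunk introduces: when the KV cache is quantized, a zero k or v scale would make dequantization meaningless, so the backend now fails fast instead of writing an unusable cache. The helper name validate_kv_scales and the placeholder dtype string below are illustrative only and are not part of the vLLM backend API.

def validate_kv_scales(kv_cache_quantized_dtype, k_scale_float, v_scale_float):
    # Mirrors the check added in this commit: if the KV cache is quantized,
    # zero scales are invalid, so raise early.
    if kv_cache_quantized_dtype is not None and (
            k_scale_float == 0.0 or v_scale_float == 0.0):
        raise ValueError(
            "k_scale_float and v_scale_float must be non-zero")


validate_kv_scales(None, 0.0, 0.0)        # ok: no quantization, scales unused
validate_kv_scales("fp8_e4m3", 0.5, 0.5)  # ok: quantized with non-zero scales
try:
    validate_kv_scales("fp8_e4m3", 0.0, 0.5)  # zero k scale with quantization
except ValueError as err:
    print(err)  # -> k_scale_float and v_scale_float must be non-zero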
