
Commit c9cdca8

mgoin authored and wwl2755-google committed
Fix FA2 fallback for Blackwell V1 (vllm-project#19781)
Signed-off-by: mgoin <mgoin64@gmail.com>
1 parent 20e6f96 commit c9cdca8

File tree: 1 file changed (+1, -1 lines changed)


vllm/platforms/cuda.py (1 addition, 1 deletion)

@@ -255,7 +255,7 @@ def get_attn_backend_cls(cls, selected_backend, head_size, dtype,
                         "install FlashInfer for better performance.")
                     pass
             # FlashAttention is the default for SM 8.0+ GPUs
-            elif cls.has_device_capability(80):
+            if cls.has_device_capability(80):
                 logger.info_once("Using Flash Attention backend on V1 engine.")
                 return ("vllm.v1.attention.backends."
                         "flash_attn.FlashAttentionBackend")
