
Commit 467bef1

[BugFix][FlashInfer] Fix attention backend interface mismatch with unexpected keyword use_irope (vllm-project#19134)
Signed-off-by: Yunqiu Guo <guorachel@meta.com>
1 parent 5f1ac1e commit 467bef1

File tree

1 file changed: +5 −0 lines changed


vllm/v1/attention/backends/flashinfer.py

Lines changed: 5 additions & 0 deletions
@@ -508,7 +508,12 @@ def __init__(
         logits_soft_cap: Optional[float] = None,
         attn_type: AttentionType = AttentionType.DECODER,
         kv_sharing_target_layer_name: Optional[int] = None,
+        use_irope: bool = False,
     ) -> None:
+        if use_irope:
+            logger.warning_once(
+                "Using irope in FlashInfer is not supported yet, it will fall"
+                " back to global attention for long context.")
         self.num_heads = num_heads
         self.head_size = head_size
         self.scale = float(scale)
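For context, the "interface mismatch" in the commit title is the usual Python failure mode when a caller forwards a keyword argument that a backend's __init__ does not declare: construction fails with a TypeError. The sketch below is a simplified, hypothetical illustration (the class names and caller are not the actual vLLM classes; the real change is the five added lines in vllm/v1/attention/backends/flashinfer.py above). It shows why accepting use_irope with a default and emitting a warning, as the commit does, resolves the mismatch.

    # Simplified sketch of the failure mode this commit fixes.
    # Class names here are illustrative, not vLLM's real backend classes.
    import logging

    logging.basicConfig(level=logging.WARNING)
    logger = logging.getLogger(__name__)


    class BackendWithoutIrope:
        # Does not declare use_irope, so passing it raises TypeError.
        def __init__(self, num_heads: int, head_size: int) -> None:
            self.num_heads = num_heads
            self.head_size = head_size


    class BackendWithIrope:
        # Declares use_irope with a default, mirroring the fix: the keyword
        # is accepted everywhere, and unsupported backends just warn.
        def __init__(self, num_heads: int, head_size: int,
                     use_irope: bool = False) -> None:
            if use_irope:
                logger.warning(
                    "Using irope in FlashInfer is not supported yet, it will"
                    " fall back to global attention for long context.")
            self.num_heads = num_heads
            self.head_size = head_size


    if __name__ == "__main__":
        try:
            # Before the fix: the shared caller passes use_irope to every
            # backend, and this one rejects it.
            BackendWithoutIrope(num_heads=8, head_size=64, use_irope=True)
        except TypeError as exc:
            print(f"before fix: {exc}")

        # After the fix: construction succeeds and a warning is logged.
        BackendWithIrope(num_heads=8, head_size=64, use_irope=True)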
