Commit 54cf1ca

[Misc] Do not print async output warning for v1 (#21151)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
1 parent 5780121 commit 54cf1ca

2 files changed: +2 -2 lines changed

vllm/platforms/cuda.py

Lines changed: 1 addition & 1 deletion
@@ -99,7 +99,7 @@ def get_device_total_memory(cls, device_id: int = 0) -> int:

     @classmethod
     def is_async_output_supported(cls, enforce_eager: Optional[bool]) -> bool:
-        if enforce_eager:
+        if enforce_eager and not envs.VLLM_USE_V1:
             logger.warning(
                 "To see benefits of async output processing, enable CUDA "
                 "graph. Since, enforce-eager is enabled, async output "

vllm/platforms/rocm.py

Lines changed: 1 addition & 1 deletion
@@ -299,7 +299,7 @@ def get_device_total_memory(cls, device_id: int = 0) -> int:

     @classmethod
     def is_async_output_supported(cls, enforce_eager: Optional[bool]) -> bool:
-        if enforce_eager:
+        if enforce_eager and not envs.VLLM_USE_V1:
             logger.warning(
                 "To see benefits of async output processing, enable CUDA "
                 "graph. Since, enforce-eager is enabled, async output "
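
The effect of the new condition in both files can be sketched in isolation. The snippet below is a minimal illustration, not code from this commit: the helper name should_warn_about_async_output is hypothetical, and the use_v1 parameter stands in for envs.VLLM_USE_V1 so the example runs without vLLM installed. It models only whether the warning branch is entered; what the real method returns is not visible in the hunks above and is deliberately left out.

```python
from typing import Optional


def should_warn_about_async_output(enforce_eager: Optional[bool],
                                   use_v1: bool) -> bool:
    """Hypothetical stand-in for the patched guard.

    `use_v1` is an assumption replacing `envs.VLLM_USE_V1`.
    The warning branch is entered only when eager mode is
    requested and the v1 engine is not in use.
    """
    return bool(enforce_eager) and not use_v1


# Enumerate every combination of the two inputs.
for enforce_eager in (None, False, True):
    for use_v1 in (False, True):
        warn = should_warn_about_async_output(enforce_eager, use_v1)
        print(f"enforce_eager={enforce_eager!s:<5} use_v1={use_v1!s:<5} warn={warn}")
```

Before the change, the guard depended on enforce_eager alone, so the enforce_eager=True, use_v1=True combination would also have entered the warning branch. That is the only combination whose behavior changes.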
