Commit 6f2e58f

Revert "[Bugfix] default set cuda_graph_sizes to max_num_seqs for v1 engine" (#20128)
Signed-off-by: Will Eaton <weaton@redhat.com>
1 parent a025eb2 commit 6f2e58f

File tree

1 file changed: +4 -9 lines changed


vllm/config.py

Lines changed: 4 additions & 9 deletions
@@ -2042,12 +2042,11 @@ class SchedulerConfig:
     NOTE: This will be replaced by speculative config in the future; it is
     present to enable correctness tests until then."""
 
-    cuda_graph_sizes: list[int] = field(default_factory=list)
-    """Cuda graph capture sizes
-    1. if none provided, then default set to [max_num_seqs]
-    2. if one value is provided, then the capture list would follow the
+    cuda_graph_sizes: list[int] = field(default_factory=lambda: [512])
+    """Cuda graph capture sizes, default is 512.
+    1. if one value is provided, then the capture list would follow the
     pattern: [1, 2, 4] + [i for i in range(8, cuda_graph_sizes + 1, 8)]
-    3. more than one value (e.g. 1 2 128) is provided, then the capture list
+    2. more than one value (e.g. 1 2 128) is provided, then the capture list
     will follow the provided list."""
 
     delay_factor: float = 0.0
@@ -2212,10 +2211,6 @@ def __post_init__(self) -> None:
                 self.max_num_partial_prefills, self.max_long_partial_prefills,
                 self.long_prefill_token_threshold)
 
-        # If cuda_graph_sizes is not specified, default set to [max_num_seqs].
-        if not self.cuda_graph_sizes:
-            self.cuda_graph_sizes = [self.max_num_seqs]
-
     @model_validator(mode='after')
     def _verify_args(self) -> Self:
         if (self.max_num_batched_tokens < self.max_model_len
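
For reference, the capture-list expansion that the docstring above describes can be sketched in plain Python. expand_cuda_graph_sizes below is a hypothetical helper written only for illustration (it is not a function in vllm/config.py); it mirrors the two cases listed in the docstring, and the trailing comments show what the restored default of [512] expands to.

# Hypothetical helper for illustration only; not part of vllm/config.py.
# It mirrors the two cases described in the cuda_graph_sizes docstring.
def expand_cuda_graph_sizes(cuda_graph_sizes: list[int]) -> list[int]:
    if len(cuda_graph_sizes) == 1:
        # One value n: capture [1, 2, 4] plus every multiple of 8 up to n.
        n = cuda_graph_sizes[0]
        return [1, 2, 4] + [i for i in range(8, n + 1, 8)]
    # More than one value: the capture list follows the provided list as-is.
    return list(cuda_graph_sizes)

# With the default restored by this commit, cuda_graph_sizes == [512]:
#   expand_cuda_graph_sizes([512])        -> [1, 2, 4, 8, 16, ..., 504, 512]
# With an explicit list such as [1, 2, 128]:
#   expand_cuda_graph_sizes([1, 2, 128])  -> [1, 2, 128]

Before this revert, an empty cuda_graph_sizes was replaced with [max_num_seqs] in __post_init__; after it, the field simply defaults to [512] at construction time.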
