
Commit 2e8cbb5

[BugFix] Fix full cuda graph slot_mapping (#21228)
Signed-off-by: fhl2000 <63384265+fhl2000@users.noreply.github.com>
1 parent 752c6ad commit 2e8cbb5

1 file changed: +1 -1 lines changed


vllm/v1/worker/gpu_model_runner.py

Lines changed: 1 addition & 1 deletion
@@ -2079,7 +2079,7 @@ def _dummy_run(
                     block_table_tensor=self.input_batch.block_table[
                         kv_cache_group_id].get_device_tensor()[:num_reqs],
                     slot_mapping=self.input_batch.
-                    block_table[kv_cache_group_id].slot_mapping[:num_reqs])
+                    block_table[kv_cache_group_id].slot_mapping[:num_tokens])
 
                 attn_metadata_i = self.attn_metadata_builders[
                     kv_cache_group_id].build_for_cudagraph_capture(
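
The commit itself gives no rationale beyond its title. One plausible reading, assuming the usual paged-KV-cache layout in which the block table has one row per request while the slot mapping has one entry per token, is that slicing by `num_reqs` hands the attention metadata a slot mapping that is too short for the dummy batch. The toy sketch below illustrates only that length mismatch; `build_slot_mapping`, `BLOCK_SIZE`, and all sizes are hypothetical and are not vLLM's real code or buffers.

```python
# Toy illustration only: names and sizes are hypothetical, not vLLM internals.
BLOCK_SIZE = 4  # assumed number of token slots per KV-cache block


def build_slot_mapping(block_table, num_scheduled_tokens, block_size=BLOCK_SIZE):
    """Return one flat KV-cache slot index per token, across all requests."""
    slot_mapping = []
    for req_blocks, n_tokens in zip(block_table, num_scheduled_tokens):
        for pos in range(n_tokens):
            # Token `pos` of this request lives in the request's
            # (pos // block_size)-th block, at offset pos % block_size.
            block_id = req_blocks[pos // block_size]
            slot_mapping.append(block_id * block_size + pos % block_size)
    return slot_mapping


block_table = [[0, 1], [2, 3]]           # one row per request
num_scheduled_tokens = [5, 3]            # tokens scheduled per request
num_reqs = len(block_table)              # per-request dimension
num_tokens = sum(num_scheduled_tokens)   # per-token dimension

slot_mapping = build_slot_mapping(block_table, num_scheduled_tokens)

# Slicing by the request count drops most of the per-token entries.
sliced_by_reqs = slot_mapping[:num_reqs]
# Slicing by the token count keeps one slot per token.
sliced_by_tokens = slot_mapping[:num_tokens]

print(len(sliced_by_reqs), len(sliced_by_tokens))
```

Under these assumptions, the block table slice `[:num_reqs]` in the unchanged context lines stays correct because that tensor is indexed per request, whereas the slot mapping needs the per-token bound.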
