
Commit edd270b

[Bugfix] Prevent IndexError for cached requests when pipeline parallelism is disabled (#20486)
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
1 parent 110df74 commit edd270b


1 file changed: +2 -0 lines changed


vllm/v1/core/sched/scheduler.py

Lines changed: 2 additions & 0 deletions
@@ -635,6 +635,8 @@ def _make_cached_request_data(
                 token_ids = req.all_token_ids[req.num_computed_tokens:req.
                                               num_computed_tokens + num_tokens]
                 new_token_ids.append(token_ids)
+            else:
+                new_token_ids.append([])
             new_block_ids.append(req_to_new_block_ids[req_id])
             num_computed_tokens.append(req.num_computed_tokens)
         # Because resumed_reqs is usually empty, it is more efficient to do
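
The hunk above keeps `new_token_ids` the same length as the other per-request lists when the branch that slices `token_ids` is skipped. The sketch below is a simplified, standalone illustration of that idea, not the scheduler's actual code. The function names, the tuple layout of `requests`, the `pad_when_no_pp` switch, and the positional consumer are all hypothetical and exist only to reproduce the mismatch in isolation.

```python
def make_cached_request_data(requests, use_pp, pad_when_no_pp):
    """Build parallel per-request lists, loosely mirroring the hunk above.

    `requests` is a list of (req_id, all_token_ids, num_computed, num_new)
    tuples (a hypothetical layout). `pad_when_no_pp` toggles the `else`
    branch that this commit adds.
    """
    req_ids, new_token_ids, num_computed_tokens = [], [], []
    for req_id, all_token_ids, num_computed, num_new in requests:
        req_ids.append(req_id)
        if use_pp:
            # Pipeline parallelism enabled: send the newly scheduled tokens.
            new_token_ids.append(
                all_token_ids[num_computed:num_computed + num_new])
        elif pad_when_no_pp:
            # The added branch: keep the list aligned with an empty entry.
            new_token_ids.append([])
        num_computed_tokens.append(num_computed)
    return req_ids, new_token_ids, num_computed_tokens


def consume(req_ids, new_token_ids, num_computed_tokens):
    """Hypothetical consumer that indexes every list by request position."""
    return {
        req_id: (new_token_ids[i], num_computed_tokens[i])
        for i, req_id in enumerate(req_ids)
    }


if __name__ == "__main__":
    reqs = [("a", [1, 2, 3, 4], 2, 1), ("b", [5, 6, 7], 3, 0)]

    # With the padding branch, positional access works without PP.
    print(consume(*make_cached_request_data(reqs, False, True)))

    # Without it, new_token_ids stays empty and positional access fails.
    try:
        consume(*make_cached_request_data(reqs, False, False))
    except IndexError as exc:
        print("IndexError:", exc)
```

Appending an empty list instead of skipping the entry means a consumer that walks the lists by index never has to know whether pipeline parallelism is on.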
