Commit 47e4388

fix: stream all chunks for multiseq
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
1 parent 879d615 commit 47e4388

1 file changed: +4 -5 lines changed

vllm/sequence.py

Lines changed: 4 additions & 5 deletions
@@ -1534,12 +1534,11 @@ def add_request(request_id: str, engine, params, **kwargs):
     def maybe_assemble_group(
             self, seq_group: SequenceGroup) -> Optional[SequenceGroup]:
 
-        # in the streaming mode, we will return the assembled sequence
-        # for the first remaining sequence, and then return None for the
-        # rest of sequences
+        # in the streaming mode, we will return the assembled sequence for the
+        # last remaining sequence, and return None for the rest of sequences
         if self.streaming:
-            first_remaining_id = next(iter(self.to_be_finished))
-            if seq_group.request_id == first_remaining_id:
+            last_remaining_id = list(self.to_be_finished)[-1]
+            if seq_group.request_id == last_remaining_id:
                 return self.assembled_seq_group
             return None
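The hunk above swaps which remaining request triggers the return of the assembled group. As a minimal sketch of the difference, assuming `self.to_be_finished` is an insertion-ordered `dict` keyed by request id (the diff does not show its type), the snippet below compares the old and new selections. The request ids and values are hypothetical, and `next(reversed(...))` is only a possible alternative, not something the commit uses.

```python
# Hypothetical stand-in for `self.to_be_finished`; assumed to be a dict
# keyed by request id. Dicts preserve insertion order in Python 3.7+.
to_be_finished = {
    "req-0": "seq_group_0",
    "req-1": "seq_group_1",
    "req-2": "seq_group_2",
}

# Old behaviour: pick the first key still pending.
first_remaining_id = next(iter(to_be_finished))

# New behaviour in this commit: pick the last key still pending.
last_remaining_id = list(to_be_finished)[-1]

# Possible alternative (not in the commit): same key without building a
# temporary list. Requires Python 3.8+, where dicts support reversed().
last_without_copy = next(reversed(to_be_finished))

print(first_remaining_id, last_remaining_id, last_without_copy)
```

Under that assumption, returning on the last remaining id means the assembled group is emitted only once every other pending sequence in the group has been visited in the current step, which appears to be the intent of "stream all chunks for multiseq".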
