Commit 1650e2b

fix: stream all chunks for multiseq
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
1 parent 18b66dc commit 1650e2b

1 file changed, +4 -5 lines changed

vllm/sequence.py

Lines changed: 4 additions & 5 deletions
@@ -1529,12 +1529,11 @@ def add_request(request_id: str, engine, params, **kwargs):
     def maybe_assemble_group(
             self, seq_group: SequenceGroup) -> Optional[SequenceGroup]:
 
-        # in the streaming mode, we will return the assembled sequence
-        # for the first remaining sequence, and then return None for the
-        # rest of sequences
+        # in the streaming mode, we will return the assembled sequence for the
+        # last remaining sequence, and return None for the rest of sequences
         if self.streaming:
-            first_remaining_id = next(iter(self.to_be_finished))
-            if seq_group.request_id == first_remaining_id:
+            last_remaining_id = list(self.to_be_finished)[-1]
+            if seq_group.request_id == last_remaining_id:
                 return self.assembled_seq_group
             return None
 
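The hunk swaps which remaining request triggers the streamed output. The sketch below is a toy illustration of that difference, not the vLLM implementation. It assumes `to_be_finished` is an insertion-ordered `dict` keyed by request id (the diff does not show its type), and `ToyGroup`, `emit_on_first`, `emit_on_last`, and `emitting_ids` are hypothetical names introduced only for this example.

```python
from typing import Dict, List, Optional


class ToyGroup:
    """Hypothetical stand-in for the class that owns maybe_assemble_group."""

    def __init__(self, request_ids: List[str]) -> None:
        # Assumption: to_be_finished is an insertion-ordered dict keyed by
        # request id. The values are irrelevant for this sketch.
        self.to_be_finished: Dict[str, str] = {rid: rid for rid in request_ids}
        self.assembled_seq_group = "assembled"

    def emit_on_first(self, request_id: str) -> Optional[str]:
        # Behaviour before the commit: only the first remaining id emits.
        first_remaining_id = next(iter(self.to_be_finished))
        if request_id == first_remaining_id:
            return self.assembled_seq_group
        return None

    def emit_on_last(self, request_id: str) -> Optional[str]:
        # Behaviour after the commit: only the last remaining id emits.
        last_remaining_id = list(self.to_be_finished)[-1]
        if request_id == last_remaining_id:
            return self.assembled_seq_group
        return None


def emitting_ids(group: ToyGroup, method_name: str) -> List[str]:
    """Return the request ids for which the chosen method yields an output."""
    method = getattr(group, method_name)
    return [rid for rid in group.to_be_finished if method(rid) is not None]


group = ToyGroup(["req_0", "req_1", "req_2"])
first = emitting_ids(group, "emit_on_first")
last = emitting_ids(group, "emit_on_last")
print(first, last)
```

In both variants exactly one request per pass yields the assembled group; the commit only moves that point from the first remaining entry to the last. If `to_be_finished` is indeed a `dict`, `next(reversed(self.to_be_finished))` would select the same key without building a temporary list, since dicts support `reversed()` from Python 3.8 onward.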