
[Bug]: When using streaming with the OpenAI API, the ID received with each chunk isn't from OpenAI; rather, it's generated by the LiteLLM client. #10280


Open
chaunceyhw opened this issue Apr 24, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@chaunceyhw

What happened?

request

import litellm

response = litellm.completion(
    model="openai/gpt-4o-mini",
    api_key="sk-xxxx",
    stream=True,
    stream_options={"include_usage": True},
    messages=[{"role": "user", "content": "hello"}],
)
for item in response:
    print(f"response item ===============:{item}")

response

response item ===============:ModelResponseStream(id='chatcmpl-fcd248be-a3b9-45ee-bbd9-1ef2b74d3681', created=1745512243, model='gpt-4o-mini', object='chat.completion.chunk', system_fingerprint='fp_dbaca60df0', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(provider_specific_fields=None, refusal=None, content='Hello', role='assistant', function_call=None, tool_calls=None, audio=None), logprobs=None)], provider_specific_fields=None, stream_options={'include_usage': True}, citations=None)
response item ===============:ModelResponseStream(id='chatcmpl-fcd248be-a3b9-45ee-bbd9-1ef2b74d3681', created=1745512243, model='gpt-4o-mini', object='chat.completion.chunk', system_fingerprint='fp_dbaca60df0', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(provider_specific_fields=None, refusal=None, content='!', role=None, function_call=None, tool_calls=None, audio=None), logprobs=None)], provider_specific_fields=None, stream_options={'include_usage': True}, citations=None)
response item ===============:ModelResponseStream(id='chatcmpl-fcd248be-a3b9-45ee-bbd9-1ef2b74d3681', created=1745512243, model='gpt-4o-mini', object='chat.completion.chunk', system_fingerprint='fp_dbaca60df0', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(provider_specific_fields=None, refusal=None, content=' How', role=None, function_call=None, tool_calls=None, audio=None), logprobs=None)], provider_specific_fields=None, stream_options={'include_usage': True}, citations=None)
response item ===============:ModelResponseStream(id='chatcmpl-fcd248be-a3b9-45ee-bbd9-1ef2b74d3681', created=1745512243, model='gpt-4o-mini', object='chat.completion.chunk', system_fingerprint='fp_dbaca60df0', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(provider_specific_fields=None, refusal=None, content=' can', role=None, function_call=None, tool_calls=None, audio=None), logprobs=None)], provider_specific_fields=None, stream_options={'include_usage': True}, citations=None)
response item ===============:ModelResponseStream(id='chatcmpl-fcd248be-a3b9-45ee-bbd9-1ef2b74d3681', created=1745512243, model='gpt-4o-mini', object='chat.completion.chunk', system_fingerprint='fp_dbaca60df0', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(provider_specific_fields=None, refusal=None, content=' I', role=None, function_call=None, tool_calls=None, audio=None), logprobs=None)], provider_specific_fields=None, stream_options={'include_usage': True}, citations=None)
response item ===============:ModelResponseStream(id='chatcmpl-fcd248be-a3b9-45ee-bbd9-1ef2b74d3681', created=1745512243, model='gpt-4o-mini', object='chat.completion.chunk', system_fingerprint='fp_dbaca60df0', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(provider_specific_fields=None, refusal=None, content=' assist', role=None, function_call=None, tool_calls=None, audio=None), logprobs=None)], provider_specific_fields=None, stream_options={'include_usage': True}, citations=None)
response item ===============:ModelResponseStream(id='chatcmpl-fcd248be-a3b9-45ee-bbd9-1ef2b74d3681', created=1745512243, model='gpt-4o-mini', object='chat.completion.chunk', system_fingerprint='fp_dbaca60df0', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(provider_specific_fields=None, refusal=None, content=' you', role=None, function_call=None, tool_calls=None, audio=None), logprobs=None)], provider_specific_fields=None, stream_options={'include_usage': True}, citations=None)
response item ===============:ModelResponseStream(id='chatcmpl-fcd248be-a3b9-45ee-bbd9-1ef2b74d3681', created=1745512243, model='gpt-4o-mini', object='chat.completion.chunk', system_fingerprint='fp_dbaca60df0', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(provider_specific_fields=None, refusal=None, content=' today', role=None, function_call=None, tool_calls=None, audio=None), logprobs=None)], provider_specific_fields=None, stream_options={'include_usage': True}, citations=None)
response item ===============:ModelResponseStream(id='chatcmpl-fcd248be-a3b9-45ee-bbd9-1ef2b74d3681', created=1745512243, model='gpt-4o-mini', object='chat.completion.chunk', system_fingerprint='fp_dbaca60df0', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(provider_specific_fields=None, refusal=None, content='?', role=None, function_call=None, tool_calls=None, audio=None), logprobs=None)], provider_specific_fields=None, stream_options={'include_usage': True}, citations=None)
response item ===============:ModelResponseStream(id='chatcmpl-fcd248be-a3b9-45ee-bbd9-1ef2b74d3681', created=1745512243, model='gpt-4o-mini', object='chat.completion.chunk', system_fingerprint='fp_dbaca60df0', choices=[StreamingChoices(finish_reason='stop', index=0, delta=Delta(provider_specific_fields=None, content=None, role=None, function_call=None, tool_calls=None, audio=None), logprobs=None)], provider_specific_fields=None, stream_options={'include_usage': True})
response item ===============:ModelResponseStream(id='chatcmpl-fcd248be-a3b9-45ee-bbd9-1ef2b74d3681', created=1745512243, model='gpt-4o-mini', object='chat.completion.chunk', system_fingerprint='fp_dbaca60df0', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(provider_specific_fields=None, content=None, role=None, function_call=None, tool_calls=None, audio=None), logprobs=None)], provider_specific_fields=None, stream_options={'include_usage': True})
response item ===============:ModelResponseStream(id='chatcmpl-fcd248be-a3b9-45ee-bbd9-1ef2b74d3681', created=1745512243, model='gpt-4o-mini', object='chat.completion.chunk', system_fingerprint='fp_dbaca60df0', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(provider_specific_fields=None, content=None, role=None, function_call=None, tool_calls=None, audio=None), logprobs=None)], provider_specific_fields=None, stream_options={'include_usage': True}, usage=Usage(completion_tokens=10, prompt_tokens=8, total_tokens=18, completion_tokens_details=CompletionTokensDetailsWrapper(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0, text_tokens=None), prompt_tokens_details=PromptTokensDetailsWrapper(audio_tokens=0, cached_tokens=0, text_tokens=None, image_tokens=None)))
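One way to spot the substituted ID in the output above: the suffix after `chatcmpl-` is a UUID4, whereas IDs coming straight from OpenAI use an opaque token without UUID hyphenation. A minimal sketch of that heuristic (this check is an illustration based on the formats observed here, not part of LiteLLM):

```python
import re

# 8-4-4-4-12 hex groups, i.e. a canonical UUID
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)

def looks_client_generated(chunk_id: str) -> bool:
    """Heuristic: flag IDs whose suffix is a UUID4, which is what the
    client-side fallback produces for 'chatcmpl-<uuid4>'."""
    suffix = chunk_id.removeprefix("chatcmpl-")
    return bool(UUID_RE.match(suffix))

# The ID from the log above is flagged as client-generated:
print(looks_client_generated("chatcmpl-fcd248be-a3b9-45ee-bbd9-1ef2b74d3681"))  # True
```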

Some explanation

The chunk response ID "chatcmpl-fcd248be-a3b9-45ee-bbd9-1ef2b74d3681" is not the ID returned by OpenAI. Looking at the code in /litellm/litellm_core_utils/streaming_handler, inside the method def chunk_creator(self, chunk: Any), the incoming chunk is not passed along when self.model_response_creator() is called. As a result, the SDK generates a fresh ID and assigns it to self.response_id, and later processing can no longer replace it with the provider's ID.
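The mechanism described above can be reduced to a hypothetical simplification (not LiteLLM's actual code; the class and function names here only mirror the ones mentioned): if the response object is constructed before the provider chunk is consulted, the generated fallback ID wins and sticks.

```python
import uuid

class ModelResponseStream:
    def __init__(self, id=None):
        # When no ID is supplied, a client-side one is generated
        self.id = id or f"chatcmpl-{uuid.uuid4()}"

def chunk_creator(chunk: dict) -> ModelResponseStream:
    # Buggy flow: the provider chunk's ID is never forwarded,
    # so the constructor's generated UUID is used instead
    return ModelResponseStream()

def chunk_creator_fixed(chunk: dict) -> ModelResponseStream:
    # Fixed flow: propagate the provider's ID into the response object
    return ModelResponseStream(id=chunk.get("id"))
```

With the fixed flow, `chunk_creator_fixed({"id": "chatcmpl-abc"}).id` keeps the provider's "chatcmpl-abc", while the buggy flow always returns a locally generated UUID.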

Relevant log output

Are you a ML Ops Team?

No

What LiteLLM version are you on ?

v1.66.1

Twitter / LinkedIn details

No response

@chaunceyhw chaunceyhw added the bug Something isn't working label Apr 24, 2025
@psydok

psydok commented Apr 28, 2025

Same problem when using openrouter with streaming.

Development

No branches or pull requests

2 participants