
fix(smolagents): Getting Multiple Traces While Streaming Code Agent #1872


Open · ialisaleh wants to merge 3 commits into main
Conversation

ialisaleh (Collaborator) commented:

This PR fixes an issue where CodeAgent.run(stream=True) in smolagents would generate separate traces for each step instead of grouping them under a single parent trace. The problem stemmed from Python generators not preserving OpenTelemetry context across yield statements. To address this, a new wrapped_generator() was introduced that ensures the tracing context is reattached during streaming and the final output is collected for proper span annotation. This change preserves existing behavior for non-streaming runs and ensures consistent, complete tracing data regardless of the stream flag.

Closes #1322
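
For context, a minimal sketch of the underlying technique (illustrative names; not the exact code in this PR): wrap the stream so the parent span's context is re-attached every time the generator resumes, since the ambient OpenTelemetry context at that point is otherwise whatever the consumer happens to have.

```python
from typing import Any, Generator

from opentelemetry import context as context_api
from opentelemetry import trace as trace_api

tracer = trace_api.get_tracer(__name__)


def traced_stream(inner: Generator[Any, None, None]) -> Generator[Any, None, None]:
    # `traced_stream` is an illustrative stand-in for the PR's wrapped_generator().
    span = tracer.start_span("CodeAgent.run")
    ctx = trace_api.set_span_in_context(span)
    chunks = []
    try:
        while True:
            # Re-attach the parent context before each resumption so spans
            # created inside this step become children of `span`.
            token = context_api.attach(ctx)
            try:
                chunk = next(inner)
            except StopIteration:
                break
            finally:
                context_api.detach(token)
            chunks.append(str(chunk))
            yield chunk
        span.set_attribute("output.value", "".join(chunks))
        span.set_status(trace_api.StatusCode.OK)
    except Exception as exc:
        span.record_exception(exc)
        span.set_status(trace_api.StatusCode.ERROR)
        raise
    finally:
        span.end()
```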

ialisaleh requested a review from a team as a code owner · July 16, 2025 17:18
dosubot bot added the size:M (This PR changes 30-99 lines, ignoring generated files.) label · Jul 16, 2025

dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) label and removed the size:M label · Jul 21, 2025

cursor[bot] left a comment:

Bug: Telemetry Span Status Overwritten Incorrectly

The finally blocks in both streaming and non-streaming execution paths unconditionally set the OpenTelemetry span status to OK. This overwrites any ERROR status previously set by an except block when an exception occurs, causing failed operations to be incorrectly reported as successful in telemetry.

python/instrumentation/openinference-instrumentation-smolagents/src/openinference/instrumentation/smolagents/_wrappers.py#L152-L222

```python
    def wrapped_generator() -> Generator[str, None, None]:
        try:
            # Collect chunks for final output
            for chunk in agent_output:
                output_chunks.append(str(chunk))
                yield chunk
        except Exception as e:
            span.record_exception(e)
            span.set_status(trace_api.StatusCode.ERROR)
            raise
        finally:
            # Set output value from the last observation
            steps = getattr(agent.monitor, "steps", [])
            history = getattr(agent.monitor, "history", [])
            if steps:
                observation = getattr(steps[-1], "observations", None)
                if observation:
                    span.set_attribute(OUTPUT_VALUE, observation)
            elif history:
                observation = getattr(history[-1], "observations", None)
                if observation:
                    span.set_attribute(OUTPUT_VALUE, observation)
            elif output_chunks:
                span.set_attribute(OUTPUT_VALUE, "".join(output_chunks))
            # Record token usage metadata
            span.set_attribute(
                LLM_TOKEN_COUNT_PROMPT, agent.monitor.total_input_token_count
            )
            span.set_attribute(
                LLM_TOKEN_COUNT_COMPLETION, agent.monitor.total_output_token_count
            )
            span.set_attribute(
                LLM_TOKEN_COUNT_TOTAL,
                agent.monitor.total_input_token_count
                + agent.monitor.total_output_token_count,
            )
            span.set_status(trace_api.StatusCode.OK)
            span.end()
            context_api.detach(token)

    return wrapped_generator()
# Handle non-streaming (normal) run
else:
    try:
        # Set output value from the agent output
        span.set_attribute(OUTPUT_VALUE, str(agent_output))
        # Record token usage metadata
        span.set_attribute(LLM_TOKEN_COUNT_PROMPT, agent.monitor.total_input_token_count)
        span.set_attribute(
            LLM_TOKEN_COUNT_COMPLETION, agent.monitor.total_output_token_count
        )
        span.set_attribute(
            LLM_TOKEN_COUNT_TOTAL,
            agent.monitor.total_input_token_count + agent.monitor.total_output_token_count,
        )
        return agent_output
    except Exception as e:
        span.record_exception(e)
        span.set_status(trace_api.StatusCode.ERROR)
        raise
    finally:
        span.set_status(trace_api.StatusCode.OK)
        span.end()
        context_api.detach(token)
```
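
A sketch of one possible fix, using the names from the snippet above (illustrative, not the PR's code): report OK from the tail of the try block, which is reached only on clean completion, and leave the finally block to end the span and detach the context.

```python
def wrapped_generator() -> Generator[str, None, None]:
    try:
        for chunk in agent_output:
            output_chunks.append(str(chunk))
            yield chunk
        # Reached only if the stream completed without raising.
        span.set_status(trace_api.StatusCode.OK)
    except Exception as e:
        span.record_exception(e)
        span.set_status(trace_api.StatusCode.ERROR)
        raise
    finally:
        # Attribute bookkeeping as above, but no status write here,
        # so an ERROR set in the except block is preserved.
        span.end()
        context_api.detach(token)
```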



Bug: Unhandled Exceptions Cause Resource Leaks

If the wrapped function raises an exception, the initial exception handler records the error and re-raises without ending the span or detaching the context_api token, even though both have already been set up, so both resources leak.

python/instrumentation/openinference-instrumentation-smolagents/src/openinference/instrumentation/smolagents/_wrappers.py#L139-L145

```python
try:
    agent_output = wrapped(*args, **kwargs)
except Exception as e:
    span.record_exception(e)
    span.set_status(trace_api.StatusCode.ERROR)
    raise
```
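
An illustrative remedy with the snippet's names: clean up before re-raising, so the span is ended and the context token detached even when wrapped() fails before any output is produced.

```python
try:
    agent_output = wrapped(*args, **kwargs)
except Exception as e:
    span.record_exception(e)
    span.set_status(trace_api.StatusCode.ERROR)
    span.end()                 # end the span before propagating
    context_api.detach(token)  # detach so the attached context does not leak
    raise
```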



Bug: Error Status Overwritten in Finally Block

In the streaming generator's finally block, the span status is unconditionally set to OK. This overwrites any ERROR status previously set by the except block when an exception occurs during generator iteration, leading to a loss of error information on the span.

python/instrumentation/openinference-instrumentation-smolagents/src/openinference/instrumentation/smolagents/_wrappers.py#L191-L194

```python
span.set_status(trace_api.StatusCode.OK)
span.end()
context_api.detach(token)
```
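
This is the same defect as the first report, seen from inside the generator's finally block. Besides moving the OK status into the try body as sketched earlier, another option is to gate the status write on a flag (illustrative; the errored flag is not in the PR as written):

```python
errored = False
try:
    for chunk in agent_output:
        output_chunks.append(str(chunk))
        yield chunk
except Exception as e:
    errored = True
    span.record_exception(e)
    span.set_status(trace_api.StatusCode.ERROR)
    raise
finally:
    if not errored:
        span.set_status(trace_api.StatusCode.OK)
    span.end()
    context_api.detach(token)
```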




Comment on lines +489 to +498
```python
# Handle streaming (generator) run
report_chunks = manager_code_agent.run("Fake question.", stream=True)
final_result = "".join([chunk for chunk in report_chunks])
assert final_result == "Final report."

# Handle streaming (generator) run
report_chunks = manager_toolcalling_agent.run("Fake question.", stream=True)
final_result = "".join([chunk for chunk in report_chunks])
assert final_result == "Final report."
```

A Contributor commented:

This test looks like it is being xfailed and doesn't seem to test the emitted traces at all. Can we nix it altogether and add a more minimal test for the generator?
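
For instance, a minimal test along these lines might do (a sketch: the code_agent fixture and the in-memory exporter wiring are assumed, not taken from this PR):

```python
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter


def test_streaming_run_emits_single_trace(
    code_agent,  # hypothetical fixture: a CodeAgent backed by a fake model
    in_memory_span_exporter: InMemorySpanExporter,
) -> None:
    chunks = list(code_agent.run("Fake question.", stream=True))
    assert chunks, "the generator should yield at least one chunk"
    spans = in_memory_span_exporter.get_finished_spans()
    assert spans, "instrumentation should emit spans"
    trace_ids = {span.context.trace_id for span in spans}
    assert len(trace_ids) == 1, "all steps should share a single trace"
```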

Labels
size:L This PR changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Instrumentation splits transactions into steps when streaming=True in codeAgent
2 participants