
Structured Output fails with text output + Behaviour inconsistency #1590


Open
IngLP opened this issue Apr 25, 2025 · 12 comments


IngLP commented Apr 25, 2025

Initial Checks

Description

Situation:

  • I need the LLM's text output, where it reasons and explains, so I can show it to the user.
  • I also want a structured output.

This works in NON-streaming mode, but NOT in streaming mode.

See the tests below: the first fails, the second passes.

Example Code

from typing import Union

from pydantic import BaseModel
from pydantic_ai import Agent


class CallAgent(BaseModel):
    agent_name: str


agent = Agent(
    model="google-gla:gemini-2.5-flash-preview-04-17",
    output_type=Union[str, CallAgent],
    instructions="Say hello and then transfer the user to 'user_assistant' agent",
)


async def test_output_with_str():  # FAILS
    async with agent.run_stream(user_prompt="Hello") as result:
        async for msg, is_last in result.stream_structured():
            print(msg)
    assert await result.get_output() == CallAgent(agent_name="user_assistant")


async def test_output_with_str_no_stream(): # PASSES
    result = await agent.run(user_prompt="Hello")
    assert result.output == CallAgent(agent_name="user_assistant")

Python, Pydantic AI & LLM client version

pydantic-ai 0.1.4
any LLM
Python 3.12
@IngLP IngLP changed the title Behaviour inconsistency Structured Output fails with text + Behaviour inconsistency Apr 25, 2025
@IngLP IngLP changed the title Structured Output fails with text + Behaviour inconsistency Structured Output fails with text output + Behaviour inconsistency Apr 25, 2025

IngLP commented Apr 25, 2025

It seems the fix could be this simple in agent.py, for agent responses that include a tool call. Unfortunately, this doesn't work if the agent just produces a text message.

[image attachment showing the proposed change]


DouweM commented Apr 25, 2025

@IngLP Can you please change output_type=Union[str, CallAgent] to output_type=CallAgent and see if it works as expected? That'll still allow the model to talk to the user before doing the handoff tool call, but will not cause PydanticAI to treat a text response as sufficient to complete the agent run.

@DouweM DouweM self-assigned this Apr 25, 2025

IngLP commented Apr 26, 2025

Tried it; it doesn't work. Removing str from output_type PREVENTS the LLM from outputting text.


IngLP commented Apr 26, 2025

Moreover, this also blocks you from using stream_text(), which I need in order to show reasoning progress to the user.


DouweM commented Apr 28, 2025

@IngLP Thanks for trying that, you're right that that would not be the desired result...

To help us debug this further, can you please port your code to the new iter-based approach described in #1007 (comment) (see the link to the docs there)? As noted there, the run_stream approach has some issues and is slated for deprecation. I don't expect iter to immediately solve your issue (although maybe!), but at least we'd be debugging and fixing this in the new approach rather than the old one.


DouweM commented Apr 28, 2025

@IngLP Also, have you considered making call_agent a tool the model can choose to use (with appropriate prompting pushing it to do so), instead of forcing it through the output type? That way the model is free to chat before calling the tool, and PydanticAI won't get confused when determining whether the conversation is over.


IngLP commented Apr 28, 2025

> @IngLP Also, have you considered making call_agent a tool the model can choose to use (with appropriate prompting pushing it to do so), instead of forcing it through the output type? That way the model is free to chat before calling the tool, and PydanticAI won't get confused when determining whether the conversation is over.

@DouweM indeed, this is exactly the workaround I have set up now. But it is not elegant at all, since it is an unsupported approach (see issue #1189) and makes you abuse the deps.


DouweM commented Apr 28, 2025

@IngLP Did you try the new agent.iter approach from https://ai.pydantic.dev/agents/#iterating-over-an-agents-graph?

That works as expected with output_type=Union[str, CallAgent]:

import asyncio
from typing import Union

from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.messages import (
    PartDeltaEvent,
    PartStartEvent,
    TextPart,
    TextPartDelta,
)

class CallAgent(BaseModel):
    agent_name: str


agent = Agent[None, CallAgent](
    model="google-gla:gemini-2.5-flash-preview-04-17",
    output_type=Union[str, CallAgent], # pyright: ignore
    instructions="Say hello and then transfer the user to 'user_assistant' agent",
)

async def test_with_iter():
    async with agent.iter(user_prompt="Hello") as run:
        async for node in run:
            if Agent.is_model_request_node(node):
                async with node.stream(run.ctx) as request_stream:
                    async for event in request_stream:
                        if isinstance(event, PartStartEvent) and isinstance(event.part, TextPart):
                            print(event.part.content, end="", flush=True)
                        elif isinstance(event, PartDeltaEvent) and isinstance(event.delta, TextPartDelta):
                            print(event.delta.content_delta, end="", flush=True)
        assert run.result.output == CallAgent(agent_name="user_assistant")

asyncio.run(test_with_iter())


IngLP commented Apr 29, 2025

This works for this simple case, but it doesn't do all the processing and handling performed by agent.run_stream().


DouweM commented Apr 29, 2025

@IngLP What specific behavior are you missing? iter is not as convenient as run_stream yet, but it is the direction we're moving in because of issues with run_stream like the one you ran into here.


IngLP commented Apr 30, 2025

I mean, I would have to re-implement all the logic from here:


DouweM commented Apr 30, 2025

@IngLP A good amount of that is already covered by the async with node.stream(run.ctx) as request_stream: statement inside async with agent.iter(user_prompt="Hello") as run:. What exactly are you expecting to have to reimplement? We're planning to add more convenience features around iter, so it'd be useful to know your specific concerns.
