[feat(json_adapter): Enhance parsing for single-field outputs & Pydantic v1/v2 compat #8239

Open
wants to merge 3 commits into
base: main

estsauver
Contributor

[feat(json_adapter): Enhance parsing for single-field outputs & Pydantic v1/v2 compat

This commit introduces several improvements to the JSONAdapter:

  1. Improved Parsing for Single Output Fields:
    The parse() method now correctly handles scenarios where a Language Model
    (LM) returns a raw JSON array when the DSPy Signature expects only a single output field of a list type.
    Previously, this could lead to parsing errors if the LM didn't wrap the
    single output in a JSON object.

  2. Pydantic v1/v2 Compatibility for Structured Outputs:
    The _get_structured_outputs_response_format() method has been updated
    to dynamically use Pydantic's ConfigDict for v2.x and the older
    BaseConfig for v1.x. This ensures that the structured output model
    generation (used with OpenAI's function calling/tool use features)
    is compatible across different Pydantic versions by correctly setting
    extra="forbid".
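
As a rough illustration of the version switch described in item 2, the sketch below builds a model that rejects unknown fields under either Pydantic major version. It is not the code from this PR: the helper name `make_strict_model` and the `answer` field are hypothetical. Only `ConfigDict`, `BaseConfig`, `create_model`, and `extra="forbid"` are Pydantic's own names.

```python
from typing import List

import pydantic
from pydantic import create_model

# Detect the installed major version; pydantic.VERSION is a string such as "2.7.1".
PYDANTIC_V2 = int(pydantic.VERSION.split(".")[0]) >= 2


def make_strict_model(name, **fields):
    """Hypothetical helper: create a model that forbids unknown fields."""
    if PYDANTIC_V2:
        from pydantic import ConfigDict

        config = ConfigDict(extra="forbid")
    else:
        # Pydantic v1 expects a config class rather than a dict.
        class config(pydantic.BaseConfig):
            extra = "forbid"

    return create_model(name, __config__=config, **fields)


# A single required list-typed field, declared as (type, default).
Output = make_strict_model("Output", answer=(List[str], ...))
```

With either version installed, `Output(answer=["a"])` should validate, while passing an undeclared keyword such as `surprise=1` should raise `pydantic.ValidationError`.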

Earl St Sauver added 2 commits May 18, 2025 18:22
…ic v1/v2 compatibility

This commit introduces several improvements to the JSONAdapter:

1.  **Improved Parsing for Single Output Fields:**
    The `parse()` method now correctly handles scenarios where a Language Model
    (LM) returns a raw JSON array or a primitive type (e.g., string, number,
    boolean) when the DSPy Signature expects only a single output field.
    Previously, this could lead to parsing errors if the LM didn't wrap the
    single output in a JSON object. This change improves compatibility with
    models like Gemini, which may exhibit such behavior.

2.  **Pydantic v1/v2 Compatibility for Structured Outputs:**
    The `_get_structured_outputs_response_format()` method has been updated
    to dynamically use Pydantic's `ConfigDict` for v2.x and the older
    `BaseConfig` for v1.x. This ensures that the structured output model
    generation (used with OpenAI's function calling/tool use features)
    is compatible across different Pydantic versions by correctly setting
    `extra="forbid"`.

3.  **Enhanced Error Logging:**
    Improved error logging in the `__call__` method by adding
    `traceback.format_exc(e)` when the initial attempt to use structured
    outputs fails. This provides more detailed diagnostic information for
    debugging.
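
The logging pattern in item 3 can be sketched roughly as follows. This is an assumption-laden illustration, not the adapter's actual `__call__` body: `call_with_fallback`, `primary`, and `fallback` are hypothetical names. One detail worth noting is that the standard library's `traceback.format_exc()` formats the exception currently being handled and does not take the exception object as an argument.

```python
import logging
import traceback

logger = logging.getLogger(__name__)


def call_with_fallback(primary, fallback):
    """Hypothetical helper: try `primary`, log the traceback, then use `fallback`."""
    try:
        return primary()
    except Exception:
        # format_exc() reads the exception being handled from the interpreter
        # state, so it is called without passing the exception object.
        logger.debug(
            "Structured outputs failed, falling back:\n%s",
            traceback.format_exc(),
        )
        return fallback()
```

An alternative with the same effect is `logger.debug("...", exc_info=True)`, which lets the logging module attach the traceback itself.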
- dspy/adapters/json_adapter.py: remove `logger.debug(traceback.format_exc(e))` to avoid issue with test lacking stack trace.
- dspy/adapters/json_adapter.py: tighten the conditional to `if isinstance(fields, list) and len(signature.output_fields) == 1` so that only list results are wrapped under the single output field, preventing improper handling of non-list completions.
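
A minimal sketch of the tightened condition, assuming the completion has already been reduced to a JSON string. The function name `wrap_single_list_output` and the plain list of field names are stand-ins for illustration. The real adapter works with `signature.output_fields`.

```python
import json


def wrap_single_list_output(completion, output_field_names):
    """Hypothetical helper mirroring the condition quoted in the commit message."""
    fields = json.loads(completion)
    # Only a top-level JSON array is wrapped, and only when there is exactly
    # one output field to attach it to. Everything else passes through as-is.
    if isinstance(fields, list) and len(output_field_names) == 1:
        fields = {output_field_names[0]: fields}
    return fields
```

Under this sketch a bare array for a one-field signature becomes `{"<field>": [...]}`, while objects, primitives, and arrays for multi-field signatures are left unchanged for later validation to accept or reject.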
@estsauver estsauver changed the title List root types [feat(json_adapter): Enhance parsing for single-field outputs & Pydantic v1/v2 compat May 18, 2025
@okhat okhat requested a review from TomeHirata May 19, 2025 02:13
lm_response=completion,
message="LM response cannot be serialized to a JSON object.",
)
if isinstance(fields, list) and len(signature.output_fields) == 1:
Collaborator

Isn't this only applicable if the (single) output field has type `list`?

Collaborator

Or are you saying basically let's just rely on the validation below to check anyway?

Contributor Author

I think so. It might be better to run the check on the output fields directly instead of on the `fields` variable.
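
One way the suggestion above could look, sketched under assumptions: the output fields are represented here as a plain mapping from field name to type annotation, and both helper names are hypothetical. The check inspects the declared type of the single output field before wrapping, rather than relying only on the shape of the parsed value.

```python
import json
from typing import List, get_origin


def is_list_annotation(annotation):
    """Return True for `list` and parameterized forms such as List[str]."""
    return annotation is list or get_origin(annotation) is list


def wrap_if_list_field(completion, output_annotations):
    """Hypothetical helper: `output_annotations` maps field name -> type hint."""
    fields = json.loads(completion)
    if len(output_annotations) == 1:
        ((name, annotation),) = output_annotations.items()
        # Wrap only when the declared type is a list AND the LM returned an array.
        if is_list_annotation(annotation) and isinstance(fields, list):
            fields = {name: fields}
    return fields
```

This sketch does not look inside wrappers such as `Optional[List[str]]`, so a real implementation would need to decide how to treat those.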
