Skip to content

feat: pydanticai instrumentation #1639

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

davidgmonical
Copy link
Collaborator

resolves: #1348

Adds instrumentation for PydanticAI. Captures and converts gen_ai attributes from running an Agent with instrument=True.

Screenshot 2025-05-16 at 12 50 05 PM

@davidgmonical davidgmonical requested a review from a team as a code owner May 16, 2025 20:00
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label May 16, 2025
@davidgmonical davidgmonical changed the title feat: David/pydantic instrumentation feat: pydanticai instrumentation May 16, 2025
@caroger caroger requested a review from axiomofjoy May 20, 2025 17:42
Copy link
Contributor

@mikeldking mikeldking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome - though I wonder if we should re-think this category of integration. It's not an instrumentor after all.

On the js side we just do openinference-xxxx

Note you are missing :

  • the changes to the base readme
  • release please manifest changes
  • symlinking support for a single interactive install

@@ -0,0 +1 @@
# Migrations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure you need this?

@@ -0,0 +1,3 @@
# OpenInference PydanticAI Instrumentation

Python auto-instrumentation library for PydanticAI.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you flesh this out more? it gets scaffolded into pypi and is important for various reasons like search.

@@ -0,0 +1,45 @@
import os
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: name it according to the use case. it's easier to grep for

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also add a more interesting example?

Comment on lines 20 to 21
# Launch Phoenix app
session = px.launch_app()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably avoid this pattern - it encourages bad practices. Just assume phoenix is running.

build-backend = "hatchling.build"

[project]
name = "openinference-instrumentation-pydanticai"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if this is not actually an instrumentor, it might be worth naming this opeininference-pydanticai rather than instrumentation since it's not an instrumentor

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

though if this just makes the symlinking confusing I think it's okay. Defer to @axiomofjoy

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would also use pydantic-ai everywhere rather than pydanticai to match the package name

Comment on lines 39 to 40
"pydantic>=2.10.0",
"pydantic-ai>=0.2.0",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the pydantic necessary here?

from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from pydantic import BaseModel
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get a circular import on this line:

Traceback (most recent call last):
  File "/Users/mikeldking/work/openinference/python/instrumentation/openinference-instrumentation-pydanticai/./examples/pydantic.py", line 3, in <module>
    import phoenix as px
  File "/Users/mikeldking/work/phoenix/src/phoenix/__init__.py", line 10, in <module>
    from .session.client import Client
  File "/Users/mikeldking/work/phoenix/src/phoenix/session/client.py", line 31, in <module>
    from phoenix.db.insertion.dataset import DatasetKeys
  File "/Users/mikeldking/work/phoenix/src/phoenix/db/__init__.py", line 1, in <module>
    from .engines import get_printable_db_url
  File "/Users/mikeldking/work/phoenix/src/phoenix/db/engines.py", line 21, in <module>
    from phoenix.db.helpers import SupportedSQLDialect
  File "/Users/mikeldking/work/phoenix/src/phoenix/db/helpers.py", line 23, in <module>
    from phoenix.db import models
  File "/Users/mikeldking/work/phoenix/src/phoenix/db/models.py", line 48, in <module>
    from phoenix.db.types.annotation_configs import (
  File "/Users/mikeldking/work/phoenix/src/phoenix/db/types/annotation_configs.py", line 4, in <module>
    from pydantic import AfterValidator, Field, RootModel, model_validator
  File "/Users/mikeldking/work/openinference/python/instrumentation/openinference-instrumentation-pydanticai/examples/pydantic.py", line 8, in <module>
    from pydantic import BaseModel
ImportError: cannot import name 'BaseModel' from partially initialized module 'pydantic' (most likely due to a circular import) (/Users/mikeldking/work/openinference/python/instrumentation/openinference-instrumentation-pydanticai/examples/pydantic.py)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be fixed once I rename the script

Comment on lines 58 to 59
result = self._base_exporter.force_flush(timeout_millis)
return bool(result)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just return the base force_flush?

FUNCTION = "function"


class GenAICommonAttributes:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there no package that has these experimental attributes?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call out - looks like I can get most of them from the opentelemetry-python package.

There seem to be some left out, specifically the body fields for events (the docs link an issue to migrate them which might be why). I will keep those defined in our own constants but use the imported ones wherever I can

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree here. The more we can leverage types/ enums/ constants from OTel or Pydantic itself, the better.

build-backend = "hatchling.build"

[project]
name = "openinference-instrumentation-pydanticai"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would also use pydantic-ai everywhere rather than pydanticai to match the package name

Comment on lines 62 to 64
assert attributes.get(SpanAttributes.LLM_TOKEN_COUNT_PROMPT) == 58
assert attributes.get(SpanAttributes.LLM_TOKEN_COUNT_COMPLETION) == 22
assert attributes.get(SpanAttributes.LLM_TOKEN_COUNT_TOTAL) == 80
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not hard code these to any particular recording of the cassette so we can re-run and re-record the tests easily, e.g., just check for int and check that the numbers add up.

Comment on lines 1 to 9
from openinference.instrumentation.pydanticai.span_exporter import OpenInferenceSpanExporter
from openinference.instrumentation.pydanticai.utils import is_openinference_span
from openinference.instrumentation.pydanticai.version import __version__

__all__ = [
"OpenInferenceSpanExporter",
"is_openinference_span",
"__version__",
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from openinference.instrumentation.pydanticai.span_exporter import OpenInferenceSpanExporter
from openinference.instrumentation.pydanticai.utils import is_openinference_span
from openinference.instrumentation.pydanticai.version import __version__
__all__ = [
"OpenInferenceSpanExporter",
"is_openinference_span",
"__version__",
]
from openinference.instrumentation.pydanticai.span_exporter import OpenInferenceSpanExporter
from openinference.instrumentation.pydanticai.version import __version__
__all__ = [
"OpenInferenceSpanExporter",
"__version__",
]

This looks like a helper function we are using in tests only rather than something we want to expose to end users?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this was kind of copying the vercel js implementation. I assumed it's because you end up with some spans that don't map to openinference ones (something I saw as well in pydantic). I was thinking it is a convenience function for a user to block those from getting to phoenix, but probably need more context (maybe @Parker-Stafford can chime in) on what that was for in vercel

Comment on lines 11 to 15
def is_openinference_span(span: ReadableSpan) -> bool:
"""Check if a span is an OpenInference span."""
if span.attributes is None:
return False
return SpanAttributes.OPENINFERENCE_SPAN_KIND in span.attributes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like something we only use in tests. Can I be in the test file rather than shipped with the package?

from openinference.instrumentation.pydanticai.utils import SpanFilter, should_export_span


class OpenInferenceSpanExporter(SpanExporter):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why use an exporter rather than a span processor?

Comment on lines 185 to 189
class OpenInferenceAttributesExtractor:
"""Extracts OpenInference attributes from GenAI attributes."""

def __init__(self) -> None:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strikes me that all of the methods on this class are pure functions and the class has no state. Maybe they can just be functions?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah makes sense

FUNCTION = "function"


class GenAICommonAttributes:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree here. The more we can leverage types/ enums/ constants from OTel or Pydantic itself, the better.

@mikeldking
Copy link
Contributor

That trace looks pretty great - I wish we could figure out the tool spans!

Start Phoenix in the background as a collector. By default, it listens on `http://localhost:6006`. You can visit the app via a browser at the same address. (Phoenix does not send data over the internet. It only operates locally on your machine.)

```shell
python -m phoenix.server.main serve
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
python -m phoenix.server.main serve
phoenix serve

import logging
from typing import Any, Dict, Iterator, List, Mapping, Optional, Tuple, cast

from opentelemetry.semconv._incubating.attributes.gen_ai_attributes import (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

dependencies = [
"opentelemetry-api",
"opentelemetry-instrumentation",
"opentelemetry-semantic-conventions",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need a minimum version here. I get an import error:

ImportError: cannot import name 'GEN_AI_REQUEST_SEED' from 'opentelemetry.semconv._incubating.attributes.gen_ai_attributes' (/Users/mikeldking/anaconda3/envs/phoenix/lib/python3.12/site-packages/opentelemetry/semconv/_incubating/attributes/gen_ai_attributes.py). Did you mean: 'GEN_AI_REQUEST_MODEL'?

@mikeldking
Copy link
Contributor

I hit a road block with the import of the conventions. Will dogfood tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:XXL This PR changes 1000+ lines, ignoring generated files.
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

☔ [feature request] Pydantic AI instrumentaiton
3 participants