Initial Checks
- I confirm that I'm using the latest version of Pydantic AI
- I confirm that I searched for my issue in https://github.com/pydantic/pydantic-ai/issues before opening this issue
Description
There may be a bug in how cached_tokens are handled in PydanticAI when using Langfuse.
According to the Langfuse documentation:
Usage types can be arbitrary strings and differ by LLM provider. At the highest level, they can simply be input and output. As LLMs grow more sophisticated, additional usage types are necessary, such as cached_tokens, audio_tokens, and image_tokens.
In the UI, Langfuse summarizes all usage types that include the string input as input usage types, and similarly those including output as output usage types. If no total usage type is ingested, Langfuse sums up all usage type units to compute the total.
However, PydanticAI currently logs cached tokens under the attribute gen_ai.usage.details.cached_tokens. Because that name contains neither "input" nor "output", Langfuse treats it as an "other" usage type rather than grouping it under input. As a result, the total usage and cost calculations in Langfuse may be incorrect.

As the screenshot shows, cached_tokens is counted as part of total usage rather than being grouped under input.
I believe this behavior should be adjusted in PydanticAI, perhaps conditionally when Langfuse (rather than Logfire) is used as the tracing backend, to ensure compatibility.
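In the meantime, a possible user-side workaround is to rename the attribute before export so that its name contains "input", which Langfuse then groups under input usage. Below is a minimal sketch, not PydanticAI's API: the renamed key input_cached_tokens is my own choice rather than an established convention, and overwriting ReadableSpan attributes relies on OpenTelemetry SDK internals.

```python
from opentelemetry.sdk.trace.export import SpanExporter, SpanExportResult

OLD_KEY = "gen_ai.usage.details.cached_tokens"
# Assumption: any usage key containing "input" is summed as input usage by Langfuse.
NEW_KEY = "gen_ai.usage.details.input_cached_tokens"


class CachedTokenRenamingExporter(SpanExporter):
    """Wraps another exporter and renames the cached-token attribute on the way out."""

    def __init__(self, wrapped: SpanExporter):
        self._wrapped = wrapped

    def export(self, spans) -> SpanExportResult:
        for span in spans:
            attrs = dict(span.attributes or {})
            if OLD_KEY in attrs:
                attrs[NEW_KEY] = attrs.pop(OLD_KEY)
                # ReadableSpan attributes are not officially mutable; replacing the
                # internal mapping is a hack that works with the current SDK.
                span._attributes = attrs
        return self._wrapped.export(spans)

    def shutdown(self) -> None:
        self._wrapped.shutdown()

    def force_flush(self, timeout_millis: int = 30_000) -> bool:
        return self._wrapped.force_flush(timeout_millis)
```

One would then wire this wrapper in place of the plain OTLP exporter, e.g. via a BatchSpanProcessor passed to logfire.configure(additional_span_processors=...).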
Example Code
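A minimal sketch of a reproduction, assuming the standard Langfuse OTLP setup; the credentials, model, and prompt below are placeholders:

```python
import base64
import os

# Assumption: Langfuse Cloud's OTLP endpoint with Basic auth from the project keys.
LANGFUSE_AUTH = base64.b64encode(b"pk-lf-...:sk-lf-...").decode()
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "https://cloud.langfuse.com/api/public/otel"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {LANGFUSE_AUTH}"

import logfire
from pydantic_ai import Agent

# Export spans via OTLP to Langfuse instead of sending them to Logfire.
logfire.configure(send_to_logfire=False)

agent = Agent("openai:gpt-4o", instrument=True)

# Run the same long prompt twice so the second call hits the provider's prompt
# cache and the response reports cached tokens.
long_prompt = "<prompt long enough to trigger provider-side prompt caching>"
for _ in range(2):
    result = agent.run_sync(long_prompt)
    print(result.usage())

# In Langfuse, the second generation shows gen_ai.usage.details.cached_tokens
# as an "other" usage type instead of being grouped under input.
```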
Python, Pydantic AI & LLM client version
python: 3.13
"logfire[httpx]>=3.17.0",
"pydantic-ai-slim[mcp,openai]>=0.3.5",
"pydantic-ai[logfire]>=0.3.5",