remove extra usage additions during yielding in gemini agent #1577

Draft: almeidaalajoel wants to merge 2 commits into base: main

Conversation

@almeidaalajoel commented Apr 24, 2025

The Gemini agent is overcounting tokens by adding up the usage reported on each chunk returned from the Gemini client. Each chunk's usage_metadata carries the cumulative usage up to that chunk, not the usage for that chunk alone.

Instead of summing the usage across chunks, we should just track whatever the latest response said the usage was.
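
A minimal sketch of the idea (hypothetical helper name, not the actual pydantic-ai internals): overwrite the tracked usage with each chunk's usage_metadata instead of adding it.

```python
# Hypothetical sketch, not the actual pydantic-ai code: each streamed
# chunk's usage_metadata reports cumulative totals, so overwrite rather
# than sum.
def usage_from_stream(chunks):
    usage = {'prompt_token_count': 0, 'candidates_token_count': 0, 'total_token_count': 0}
    for chunk in chunks:
        meta = chunk.get('usage_metadata') or {}
        for key in usage:
            if key in meta:
                usage[key] = meta[key]  # latest chunk's totals replace the old ones
    return usage
```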

@almeidaalajoel (Author) commented Apr 24, 2025

Hmm, failing tests; this may be a more complex fix. Is the issue on the Gemini side?

Looks like this may not be a perfect fix, but I wanted to highlight the issue.

@almeidaalajoel changed the title from "remove extra usage additions during yielding in gemini client" to "remove extra usage additions during yielding in gemini agent" Apr 24, 2025
@almeidaalajoel (Author) commented Apr 24, 2025

For reference, printing each of the r objects as they come in looks like this for gemini-2.5-pro-exp:

{'candidates': [{'content': {'role': 'model', 'parts': [{'text': '...'}]}, 'index': 0}], 'usage_metadata': {'prompt_token_count': 21, 'candidates_token_count': 625, 'total_token_count': 646}, 'model_version': 'gemini-2.5-pro-preview-03-25'}

{'candidates': [{'content': {'role': 'model', 'parts': [{'text': '...'}]}, 'index': 0}], 'usage_metadata': {'prompt_token_count': 21, 'candidates_token_count': 650, 'total_token_count': 671}, 'model_version': 'gemini-2.5-pro-preview-03-25'}

{'candidates': [{'content': {'role': 'model', 'parts': [{'text': ' ...'}]}, 'index': 0}], 'usage_metadata': {'prompt_token_count': 21, 'candidates_token_count': 676, 'total_token_count': 697}, 'model_version': 'gemini-2.5-pro-preview-03-25'}

Clearly, the prompt_token_count should not be re-added with every chunk that comes in. And you can also see that candidates_token_count reports the cumulative total up to the current chunk, not the count for just the current chunk.
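
Plugging these three chunks into the overwrite sketch above (hypothetical usage_from_stream helper) gives the documented totals, while the current summing approach triples the prompt count:

```python
# Usage data copied from the gemini-2.5-pro chunks above.
chunks = [
    {'usage_metadata': {'prompt_token_count': 21, 'candidates_token_count': 625, 'total_token_count': 646}},
    {'usage_metadata': {'prompt_token_count': 21, 'candidates_token_count': 650, 'total_token_count': 671}},
    {'usage_metadata': {'prompt_token_count': 21, 'candidates_token_count': 676, 'total_token_count': 697}},
]
# Summing re-adds the prompt with every chunk: 21 * 3 = 63 prompt tokens reported.
assert sum(c['usage_metadata']['prompt_token_count'] for c in chunks) == 63
# Overwriting keeps the final cumulative totals: 21 prompt, 676 output, 697 total.
assert usage_from_stream(chunks) == {
    'prompt_token_count': 21, 'candidates_token_count': 676, 'total_token_count': 697}
```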

However, testing with 2.0 and 1.5, it looked like this:

{'candidates': [{'content': {'role': 'model', 'parts': [{'text': ''}]}}], 'usage_metadata': {'prompt_token_count': 21, 'total_token_count': 21}, 'model_version': 'gemini-1.5-flash'}

{'candidates': [{'content': {'role': 'model', 'parts': [{'text': '...'}]}}], 'usage_metadata': {'prompt_token_count': 21, 'total_token_count': 21}, 'model_version': 'gemini-1.5-flash'}

{'candidates': [{'content': {'role': 'model', 'parts': [{'text': '...'}]}}], 'usage_metadata': {'prompt_token_count': 21, 'total_token_count': 21}, 'model_version': 'gemini-1.5-flash'}

{'candidates': [{'content': {'role': 'model', 'parts': [{'text': '...'}]}}], 'usage_metadata': {'prompt_token_count': 21, 'total_token_count': 21}, 'model_version': 'gemini-1.5-flash'}

{'candidates': [{'content': {'role': 'model', 'parts': [{'text': "..."}]}}], 'usage_metadata': {'prompt_token_count': 21, 'total_token_count': 21}, 'model_version': 'gemini-1.5-flash'}

(I removed the text for easier reading, but it was quite clear that the number of tokens in each chunk did not match the reported counts.)

Here the candidates_token_count is not reported at all (though the prompt_token_count would still be off when summed). Not sure if this is an issue with Google reporting different things on different models?
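
Running the same hypothetical overwrite logic over the 1.5-flash chunks shows the remaining gap: when the model never sends candidates_token_count, no client-side bookkeeping can recover it.

```python
# Shape copied from the gemini-1.5-flash chunks above; candidates_token_count
# is never present, so the tracked usage can only reflect what the API sent.
chunks = [{'usage_metadata': {'prompt_token_count': 21, 'total_token_count': 21}}] * 5
assert usage_from_stream(chunks) == {
    'prompt_token_count': 21,
    'candidates_token_count': 0,  # never reported by this model version
    'total_token_count': 21,
}
```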

@almeidaalajoel (Author)

https://ai.google.dev/api/generate-content#UsageMetadata

Indeed, the candidates_token_count already sums across all candidates, so summing it again here does not make sense.

@DouweM (Contributor) commented Apr 30, 2025

@almeidaalajoel Thank you, makes sense if this is documented by Google. Can you see if you can rebase on main and update the failing tests?

@DouweM marked this pull request as draft April 30, 2025 19:31