
Conversation

@LIHUA919
Collaborator

Summary

This PR implements fine-grained token usage tracking for individual tool calls in ChatAgent, addressing issue #3219.

Changes

  • Added token_usage field to ToolCallingRecord to store token usage information for each tool call (a rough sketch of this field follows the list)
  • Modified ChatAgent._record_tool_calling() to accept and store token usage data
  • Updated ChatAgent._execute_tool() and _aexecute_tool() to propagate token usage from LLM responses
  • Modified both sync and async tool execution loops in _step_impl() and _astep_impl() to pass token usage to tool records
  • Enhanced ToolCallingRecord.__str__() to display token usage when available
  • Added comprehensive unit test to verify token tracking functionality
  • Created example script demonstrating the new token tracking feature
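
A rough sketch of the shape the new field takes. This is a simplified stand-in, not CAMEL's exact ToolCallingRecord definition; the fields other than token_usage are illustrative:

from typing import Any, Dict, Optional

from pydantic import BaseModel


class ToolCallingRecord(BaseModel):
    """Simplified stand-in for CAMEL's ToolCallingRecord."""

    tool_name: str
    args: Dict[str, Any]
    result: Any
    tool_call_id: str
    # New in this PR: usage of the LLM response that produced this tool call.
    token_usage: Optional[Dict[str, Any]] = None

    def __str__(self) -> str:
        usage = f", usage={self.token_usage}" if self.token_usage else ""
        return f"Tool '{self.tool_name}' called with args {self.args}{usage}"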

Technical Details

When multiple tools are called in a single LLM response, they share the same token usage dict from that response. This allows users to monitor the cost of each tool call and
optimize agent workflows for cost efficiency.

The token_usage dict contains standard OpenAI-format keys:

  • prompt_tokens: Number of tokens in the prompt
  • completion_tokens: Number of tokens in the completion
  • total_tokens: Total tokens used
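
Because every tool call produced by one LLM response points at the same usage dict, per-call totals should not be summed within a batch. A small illustration with made-up numbers:

# Two tool calls from one LLM response share one usage dict (values made up).
shared_usage = {"prompt_tokens": 812, "completion_tokens": 64, "total_tokens": 876}

batch = [
    {"tool_name": "google_search", "token_usage": shared_usage},
    {"tool_name": "open_browser", "token_usage": shared_usage},
]

# The cost of the step is counted once, not once per tool call.
step_total = batch[0]["token_usage"]["total_tokens"]
print(step_total)  # 876, not 1752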

Use Case

This feature is particularly valuable for cost-sensitive applications like the Eigent search agent mentioned in the issue, where different tools (e.g., Google Search vs browser
tools) have varying token costs.

Testing

  • Added unit test: test_tool_calling_token_usage_tracking
  • Existing tests pass without modification (backward compatible)
  • Example code provided: examples/agents/tool_calling_with_token_tracking.py
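
For illustration only (not the PR's actual test code), the kind of assertion test_tool_calling_token_usage_tracking could make, reusing the simplified ToolCallingRecord sketch above:

def test_tool_calling_token_usage_tracking():
    record = ToolCallingRecord(
        tool_name="google_search",
        args={"query": "camel-ai"},
        result="top search results ...",
        tool_call_id="call_0",
        token_usage={
            "prompt_tokens": 120,
            "completion_tokens": 18,
            "total_tokens": 138,
        },
    )
    # The new field should round-trip through the record unchanged.
    assert record.token_usage is not None
    assert record.token_usage["total_tokens"] == 138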

Add token_usage field to ToolCallingRecord to monitor the cost of each
individual tool call. This enables fine-grained cost tracking when
multiple tools are invoked in a single agent step.

Changes:
- Add token_usage field to ToolCallingRecord
- Update ChatAgent to pass token usage from LLM responses to tool records
- Add unit test for token tracking
- Add example demonstrating the feature
@coderabbitai
Contributor

coderabbitai bot commented Oct 24, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@fengju0213 fengju0213 added this to the Sprint 41 milestone Oct 27, 2025
Collaborator

@fengju0213 fengju0213 left a comment


@LIHUA919 thanks for your contribution! Left some comments below.

Collaborator


Tried to test this example and got an error:

           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/suntao/Documents/GitHub/camel/camel/agents/chat_agent.py", line 3121, in _record_tool_calling
    tool_record = ToolCallingRecord(
                  ^^^^^^^^^^^^^^^^^^
  File "/Users/suntao/Documents/GitHub/camel/.venv/lib/python3.12/site-packages/pydantic/main.py", line 250, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 2 validation errors for ToolCallingRecord
token_usage.completion_tokens_details
  Input should be a valid integer [type=int_type, input_value={'accepted_prediction_tok...d_prediction_tokens': 0}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.12/v/int_type
token_usage.prompt_tokens_details
  Input should be a valid integer [type=int_type, input_value={'audio_tokens': 0, 'cached_tokens': 0}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.12/v/int_type
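
The validation errors above suggest the new field was annotated too narrowly (something like Dict[str, int]), while the OpenAI usage payload nests prompt_tokens_details and completion_tokens_details dicts. A hedged sketch of a looser annotation that would accept that payload:

from typing import Any, Dict, Optional

from pydantic import BaseModel


class ToolCallingRecord(BaseModel):
    # Dict[str, int] rejects the nested *_tokens_details dicts shown in the
    # traceback; Dict[str, Any] (or a dedicated nested model) accepts them.
    token_usage: Optional[Dict[str, Any]] = None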

# Process all tool calls
# All tools in this batch share the same token usage from the
# LLM call that generated them
current_token_usage = response.usage_dict
Collaborator


I think that after a multi-turn request, response.usage_dict cannot accurately represent the usage of a single tool call.
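
One possible direction (a sketch, not part of this PR): if usage_dict accumulates across the agent's internal tool-calling loop, attribute to each LLM call only the delta between consecutive snapshots. The helper name and numbers below are illustrative.

def per_call_usage(prev: dict, current: dict) -> dict:
    """Usage attributable to the latest LLM call, given cumulative snapshots."""
    keys = ("prompt_tokens", "completion_tokens", "total_tokens")
    return {k: current.get(k, 0) - prev.get(k, 0) for k in keys}


# Made-up cumulative snapshots from two iterations of the tool-calling loop:
after_first_call = {"prompt_tokens": 500, "completion_tokens": 40, "total_tokens": 540}
after_second_call = {"prompt_tokens": 900, "completion_tokens": 70, "total_tokens": 970}
print(per_call_usage(after_first_call, after_second_call))
# {'prompt_tokens': 400, 'completion_tokens': 30, 'total_tokens': 430}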

