Defining custom evaluators #1098
akshatshah-21 asked this question in Q&A (unanswered)
Is there a good way to define custom evaluators for evals, and plug them in easily with the current evaluation framework?
Context: I need something like the TrajectoryEvaluator, but the way it compares tool calls needs to be flexible. Consider a generic internet-search tool used as part of handling a query. To include this tool call in an eval, I would have to explicitly specify the search query argument, and TrajectoryEvaluator._are_tool_calls_equal does a strict equality comparison between actual and expected tool calls:

adk-python/src/google/adk/evaluation/trajectory_evaluator.py, lines 84 to 96 in 1773cda
But I cannot pre-determine exactly what the search query string will be, since the LLM is free to rephrase it, reformat it, add hyphens, and so on.
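To illustrate what I mean: rather than strict string equality, I'd want something like normalized or fuzzy matching for free-form arguments such as a search query. A minimal sketch (the function names here are my own, not ADK APIs):

```python
import re


def normalize_query(q: str) -> str:
    """Lowercase, strip punctuation/hyphens, and collapse whitespace so
    paraphrase-level differences in a search query don't fail the eval."""
    q = q.lower()
    q = re.sub(r"[-_,.!?\"']", " ", q)
    return " ".join(q.split())


def queries_equivalent(actual: str, expected: str) -> bool:
    """Compare two query strings after normalization instead of exact match."""
    return normalize_query(actual) == normalize_query(expected)
```

With this, `queries_equivalent("Machine-Learning basics!", "machine learning basics")` would pass, whereas the current strict comparison would not.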
In some cases, I would perhaps like to skip checking the tool args entirely, or only check a subset of them.
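The kind of comparator I'd want to plug in might look like the sketch below. Note the dict keys (`tool_name`, `tool_input`) are placeholders for whatever schema ADK actually uses for tool calls, and `tool_calls_match` is a hypothetical helper, not an existing ADK function:

```python
def tool_calls_match(
    actual: dict,
    expected: dict,
    ignore_args: frozenset = frozenset(),
) -> bool:
    """Compare two tool calls by name and args, but skip any argument
    names listed in ignore_args (e.g. free-form search queries)."""
    if actual.get("tool_name") != expected.get("tool_name"):
        return False
    # Drop the ignored args from both sides before comparing.
    actual_args = {
        k: v for k, v in actual.get("tool_input", {}).items() if k not in ignore_args
    }
    expected_args = {
        k: v for k, v in expected.get("tool_input", {}).items() if k not in ignore_args
    }
    return actual_args == expected_args
```

Then an eval config could pass something like `ignore_args={"query"}` per tool, so the search tool is still required to be called with the right `limit`, but the exact query wording is not checked.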