Commit 3fa14f0

Author: Salma Elshafey
Message: Typo fix and removal of redundant field in the prompt
1 parent 4c27dff commit 3fa14f0

2 files changed

+1
-4
lines changed

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_tool_call_accuracy/_tool_call_accuracy.py

Lines changed: 1 addition & 3 deletions
@@ -182,8 +182,7 @@ def _convert_kwargs_to_eval_input(self, **kwargs):
 
     @override
     async def _do_eval(self, eval_input: Dict) -> Dict[str, Union[float, str]]:  # type: ignore[override]
-        """Do a relevance evaluation.
-
+        """Do a tool call accuracy evaluation.
         :param eval_input: The input to the evaluator. Expected to contain
             whatever inputs are needed for the _flow method, including context
             and other fields depending on the child class.
@@ -245,7 +244,6 @@ async def _real_call(self, **kwargs):
 
     def _not_applicable_result(self, error_message):
         """Return a result indicating that the tool call is not applicable for evaluation.
-        pr
         :param eval_input: The input to the evaluator.
         :type eval_input: Dict
         :return: A dictionary containing the result of the evaluation.

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_tool_call_accuracy/tool_call_accuracy.prompty

Lines changed: 0 additions & 1 deletion
@@ -125,7 +125,6 @@ TOOL DEFINITION: {{tool_definition}}
 Your output should consist only of a JSON object, as provided in the examples, that has the following keys:
 - chain_of_thought: a string that explains your thought process to decide on the tool call accuracy level. Start this string with 'Let's think step by step:', and think deeply and precisely about which level should be chosen based on the agent's tool calls and how they were able to address the user's query.
 - tool_calls_success_level: a integer value between 1 and 5 that represents the level of tool call success, based on the level definitions mentioned before. You need to be very precise when deciding on this level. Ensure you are correctly following the rating system based on the description of each level.
-- tool_calls_sucess_result: 'pass' or 'fail' based on the evaluation level of the tool call accuracy. Levels 1 and 2 are a 'fail', levels 3, 4 and 5 are a 'pass'.
 - additional_details: a dictionary that contains the following keys:
 - tool_calls_made_by_agent: total number of tool calls made by the agent
 - correct_tool_calls_made_by_agent: total number of correct tool calls made by the agent
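
Note on the removed prompt field: the deleted `tool_calls_sucess_result` key (misspelled in the original) is a pure function of `tool_calls_success_level`, so it can be derived by the caller instead of being requested from the model. The following is a minimal sketch of that derivation, not the SDK's actual code. It assumes the threshold stated in the removed line (levels 1 and 2 fail, levels 3 to 5 pass). The names `PASS_THRESHOLD`, `derive_result`, and `parse_llm_output` are hypothetical and do not appear in this commit.

```python
import json

# Assumption: threshold taken from the removed prompt line
# ("Levels 1 and 2 are a 'fail', levels 3, 4 and 5 are a 'pass'").
PASS_THRESHOLD = 3


def derive_result(level: int, threshold: int = PASS_THRESHOLD) -> str:
    """Map a 1-5 tool call success level to 'pass' or 'fail'."""
    if not 1 <= level <= 5:
        raise ValueError(f"level must be between 1 and 5, got {level}")
    return "pass" if level >= threshold else "fail"


def parse_llm_output(raw: str) -> dict:
    """Parse the JSON object the prompt asks for and attach a derived result.

    Hypothetical helper: only the keys named in the prompt are read.
    """
    data = json.loads(raw)
    level = int(data["tool_calls_success_level"])
    return {
        "level": level,
        "result": derive_result(level),
        "reason": data.get("chain_of_thought", ""),
        "details": data.get("additional_details", {}),
    }
```

Deriving the verdict in code keeps the threshold in one place and avoids the case where the model returns a level and a pass/fail string that contradict each other.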
