sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_tool_call_accuracy/_tool_call_accuracy.py
Lines changed: 0 additions & 6 deletions

@@ -77,8 +77,6 @@ class ToolCallAccuracyEvaluator(PromptyEvaluatorBase[Union[str, float]]):
     _INVALID_SCORE_MESSAGE = "Tool call accuracy score must be between 1 and 5."
 
     _LLM_SCORE_KEY = "tool_calls_success_level"
-    _EXCESS_TOOL_CALLS_KEY = "excess_tool_calls"
-    _MISSING_TOOL_CALLS_KEY = "missing_tool_calls"
 
     id = "id"
     """Evaluator identifier, experimental and to be used only with evaluation in cloud."""
sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_tool_call_accuracy/tool_call_accuracy.prompty
Lines changed: 10 additions & 10 deletions

@@ -136,15 +136,15 @@ Your output should consist only of a JSON object, as provided in the examples, t
 - correct_tool_percentage: percentage of correct calls made by the agent for this tool. It is a value between 0.0 and 1.0
 - tool_call_errors: number of errors encountered during the tool call
 - tool_success_result: 'pass' or 'fail' based on the evaluation of the tool call accuracy for this tool
-- excess_tool_calls: a dictionary with the following keys:
-- total: total number of excess, unnecessary tool calls made by the agent
-- details: a list of dictionaries, each containing:
-- tool_name: name of the tool
-- excess_count: number of excess calls made for this query
-- missing_tool_calls: a dictionary with the following keys:
-- total: total number of missing tool calls that should have been made by the agent to be able to answer the query
-- details: a list of dictionaries, each containing:
-- tool_name: name of the tool
-- missing_count: number of missing calls for this query
+- excess_tool_calls: a dictionary with the following keys:
+- total: total number of excess, unnecessary tool calls made by the agent
+- details: a list of dictionaries, each containing:
+- tool_name: name of the tool
+- excess_count: number of excess calls made for this query
+- missing_tool_calls: a dictionary with the following keys:
+- total: total number of missing tool calls that should have been made by the agent to be able to answer the query
+- details: a list of dictionaries, each containing:
+- tool_name: name of the tool
+- missing_count: number of missing calls for this query
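
The `excess_tool_calls` and `missing_tool_calls` fields described in this hunk can be pictured as nested dictionaries. The sketch below builds one example payload and checks it against the field descriptions. The payload values, the tool name `fetch_weather`, and the helper `check_call_summary` are all hypothetical and only meant to show the shape; none of them come from the SDK.

```python
from typing import Any, Dict, List

# Example payload following the field descriptions in the prompt; values are made up.
example_output: Dict[str, Any] = {
    "excess_tool_calls": {
        "total": 1,
        "details": [{"tool_name": "fetch_weather", "excess_count": 1}],
    },
    "missing_tool_calls": {
        "total": 0,
        "details": [],
    },
}


def check_call_summary(summary: Dict[str, Any], count_key: str) -> List[str]:
    """Hypothetical check: return a list of problems found in one summary dict."""
    problems: List[str] = []
    total = summary.get("total")
    details = summary.get("details")
    if not isinstance(total, int) or total < 0:
        problems.append("total must be a non-negative integer")
    if not isinstance(details, list):
        problems.append("details must be a list")
        return problems
    for i, entry in enumerate(details):
        if not isinstance(entry, dict):
            problems.append(f"details[{i}] must be a dictionary")
            continue
        if not isinstance(entry.get("tool_name"), str):
            problems.append(f"details[{i}].tool_name must be a string")
        if not isinstance(entry.get(count_key), int):
            problems.append(f"details[{i}].{count_key} must be an integer")
    return problems
```

Calling `check_call_summary(example_output["excess_tool_calls"], "excess_count")` returns an empty list for a well-formed summary. The check deliberately does not require `total` to equal the sum of the per-tool counts, since the prompt text does not state that relationship.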