Files changed:
- sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_tool_call_accuracy/_tool_call_accuracy.py
- sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_tool_call_accuracy/tool_call_accuracy.prompty
Diff of tool_call_accuracy.prompty:

  Your output should consist only of a JSON object, as provided in the examples, that has the following keys:
  - chain_of_thought: a string that explains your thought process to decide on the tool call accuracy level. Start this string with 'Let's think step by step:', and think deeply and precisely about which level should be chosen based on the agent's tool calls and how they were able to address the user's query.
  - tool_calls_success_level: an integer value between 1 and 5 that represents the level of tool call success, based on the level definitions mentioned before. You need to be very precise when deciding on this level. Ensure you are correctly following the rating system based on the description of each level.
- - additional_details: a dictionary that contains the following keys:
+ - details: a dictionary that contains the following keys:
    - tool_calls_made_by_agent: total number of tool calls made by the agent
    - correct_tool_calls_made_by_agent: total number of correct tool calls made by the agent
-   - details: a list of dictionaries, each containing:
+   - per_tool_call_details: a list of dictionaries, each containing:
      - tool_name: name of the tool
      - total_calls_required: total number of calls required for the tool
      - correct_calls_made_by_agent: number of correct calls made by the agent
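To make the renamed schema concrete, here is a minimal sketch of a JSON object that satisfies the prompt's output description after this change. The field values (tool name, counts, level) are illustrative assumptions, not taken from a real evaluation run; only the key names come from the diff above.

```python
import json

# Hypothetical judge output matching the renamed keys in the diff:
# "details" (was "additional_details") nests "per_tool_call_details" (was "details").
sample_output = {
    "chain_of_thought": (
        "Let's think step by step: the agent made one tool call, and it "
        "retrieved exactly the information the user asked for..."
    ),
    "tool_calls_success_level": 4,  # integer between 1 and 5
    "details": {
        "tool_calls_made_by_agent": 1,
        "correct_tool_calls_made_by_agent": 1,
        "per_tool_call_details": [
            {
                "tool_name": "fetch_weather",  # assumed tool name for illustration
                "total_calls_required": 1,
                "correct_calls_made_by_agent": 1,
            },
        ],
    },
}

# The prompt requires the output to be only this JSON object.
print(json.dumps(sample_output, indent=2))
```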