Typo fix and removal of redundant field in the prompt
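
The removed prompt field (`tool_calls_sucess_result`) duplicated information already implied by `tool_calls_success_level`: per the deleted prompt line, levels 1 and 2 are a 'fail' and levels 3, 4 and 5 are a 'pass'. A minimal sketch of deriving the result from the level instead of asking the model for it is below. The function name `pass_fail_from_level` and the `threshold` parameter are hypothetical, used here for illustration only; this commit does not show where or how the evaluator computes the result.

```python
def pass_fail_from_level(level: int, threshold: int = 3) -> str:
    """Map a 1-5 tool call success level to 'pass' or 'fail'.

    Hypothetical helper: the default threshold of 3 mirrors the rule in the
    removed prompt line (levels 1-2 fail, levels 3-5 pass).
    """
    if not 1 <= level <= 5:
        raise ValueError(f"level must be between 1 and 5, got {level}")
    return "pass" if level >= threshold else "fail"


# Example: apply the rule to every valid level.
results = {level: pass_fail_from_level(level) for level in range(1, 6)}
```

Computing the result in code keeps the pass/fail rule in one place, so the model's output cannot contradict its own level.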

Salma Elshafey · Salma Elshafey · commit 3fa14f06c2f1 · 2025-07-02T21:45:13.000+03:00
diff --git a/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_tool_call_accuracy/_tool_call_accuracy.py b/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_tool_call_accuracy/_tool_call_accuracy.py
@@ -182,8 +182,7 @@ def _convert_kwargs_to_eval_input(self, **kwargs):
 
     @override
     async def _do_eval(self, eval_input: Dict) -> Dict[str, Union[float, str]]:  # type: ignore[override]
-        """Do a relevance evaluation.
-
+        """Do a tool call accuracy evaluation.
         :param eval_input: The input to the evaluator. Expected to contain
         whatever inputs are needed for the _flow method, including context
         and other fields depending on the child class.
@@ -245,7 +244,6 @@ async def _real_call(self, **kwargs):
     
     def _not_applicable_result(self, error_message):
         """Return a result indicating that the tool call is not applicable for evaluation.
-pr
         :param eval_input: The input to the evaluator.
         :type eval_input: Dict
         :return: A dictionary containing the result of the evaluation.
diff --git a/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_tool_call_accuracy/tool_call_accuracy.prompty b/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluators/_tool_call_accuracy/tool_call_accuracy.prompty
@@ -125,7 +125,6 @@ TOOL DEFINITION: {{tool_definition}}
 Your output should consist only of a JSON object, as provided in the examples, that has the following keys:
   - chain_of_thought: a string that explains your thought process to decide on the tool call accuracy level. Start this string with 'Let's think step by step:', and think deeply and precisely about which level should be chosen based on the agent's tool calls and how they were able to address the user's query.
   - tool_calls_success_level: a integer value between 1 and 5 that represents the level of tool call success, based on the level definitions mentioned before. You need to be very precise when deciding on this level. Ensure you are correctly following the rating system based on the description of each level.
-  - tool_calls_sucess_result: 'pass' or 'fail' based on the evaluation level of the tool call accuracy. Levels 1 and 2 are a 'fail', levels 3, 4 and 5 are a 'pass'.
   - additional_details: a dictionary that contains the following keys:
         - tool_calls_made_by_agent: total number of tool calls made by the agent
         - correct_tool_calls_made_by_agent: total number of correct tool calls made by the agent