Fix FP termination in step3_azure_ai_agent_group_chat.py #10771

thecsw · 2025-03-03T19:45:00Z

Motivation and Context

The example script can terminate prematurely: when the agent responds "Not approved.", the termination condition fires anyway, because the check looks for the substring "approved".

Description

Exclude "not approved" (matched case-insensitively) from the termination criteria.
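A minimal reproduction of the false positive, kept free of Semantic Kernel imports so it runs on its own. The function names and the sample strings are illustrative, and `original_check` is only an assumed shape of the sample's substring test.

```python
def original_check(content: str) -> bool:
    """Plain substring test (assumed shape of the sample's check)."""
    return "approved" in content.lower()


def patched_check(content: str) -> bool:
    """Check proposed in this PR: "approved" present, "not approved" absent."""
    resp = content.lower()
    return "approved" in resp and "not approved" not in resp


rejection = "Not approved. The copy feels tame."
print(original_check(rejection))     # True: the chat would end on a rejection
print(patched_check(rejection))      # False
print(patched_check("Approved."))    # True
```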

Contribution Checklist

The code builds clean without any errors or warnings
The PR follows the SK Contribution Guidelines and the pre-submission formatting script raises no violations
All unit tests pass, and I have added new tests where possible
I didn't break anyone 😄


TaoChenOSU · 2025-03-03T19:55:55Z

python/samples/getting_started_with_agents/azure_ai_agent/step3_azure_ai_agent_group_chat.py

+        # The agent would sometimes respond with simple "Not approved," which
+        # would trigger the termination. Even if the prompt clearly states not
+        # to use the word, it fails on 4o. This is a simple check to avoid that.
+        return "approved" in resp and not "not approved" in resp


This probably won't work for all cases. For example, this will trigger the termination too: "It wasn't approved."

I wonder if tuning the REVIEWER INSTRUCTIONS prompt will be a better solution.
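To make the concern concrete, here is a rough sketch of a check that also handles contractions such as "wasn't". The `NEGATED` pattern and the function name are hypothetical, and this is still a heuristic: it only looks for a negating word up to two words before "approved", so other phrasings can slip through.

```python
import re

# Hypothetical pattern: "not", "never", or a word ending in "n't",
# followed by at most two other words and then "approved".
NEGATED = re.compile(
    r"(?:\bnot\b|n't\b|\bnever\b)(?:\s+\w+){0,2}\s+approved\b",
    re.IGNORECASE,
)
APPROVED = re.compile(r"\bapproved\b", re.IGNORECASE)


def should_terminate(content: str) -> bool:
    """Terminate only on a whole-word "approved" that is not negated nearby."""
    return APPROVED.search(content) is not None and NEGATED.search(content) is None


for text in ("Approved.", "Not approved.", "It wasn't approved.", "Not yet approved."):
    print(text, should_terminate(text))
```

Matching on word boundaries also keeps words such as "disapproved" from counting as approval.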

Good note. I tried that first, updating the instruction blob with:

If not, provide insight on how to refine suggested copy without example, do NOT say it wasn't simply not approved.

LLMs being LLMs, they don't always follow instructions, and the model would still consistently produce:

# User: a slogan for a new line of electric cars.
# CopyWriter: "Shockingly Smooth. Quietly Powerful."
# ArtDirector: Not approved. While "Shockingly Smooth. Quietly Powerful." plays with the electric theme and contrasts the smoothness and quietness, it feels expected and perhaps too tame for a groundbreaking electric car line. A slogan should reflect the unique essence of the brand—why these cars matter within a crowded market. etc.

I haven't seen it say anything like "It wasn't approved", though I wouldn't put it past the model to generate that.

We can adjust the termination keyword to something like TERMINATE. This can depend on the model used, too: for example, gpt-4o-mini may handle it differently than gpt-4o.

Similar to what @TaoChenOSU said, I'm more keen on adjusting the Reviewer's instructions to better communicate the termination criteria: either tell it to use TERMINATE if it approves, or have it respond with APPROVED and instruct it to use all caps.

I understand we should have samples that work, but this is a sample. :) It should guide users towards what's possible in applications and it doesn't need to be the end-all-be-all.
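A sketch of the distinctive-keyword idea, written as a plain function rather than a `TerminationStrategy` subclass so it runs without Semantic Kernel installed. The instruction wording and the `should_terminate` name are assumptions made for illustration, not text from the sample.

```python
# A keyword that is unlikely to appear in ordinary review prose.
TERMINATION_KEYWORD = "TERMINATE"

# Hypothetical reviewer instructions that point the model at the keyword.
REVIEWER_INSTRUCTIONS = f"""
You are an art director who has opinions about copywriting.
If the copy is acceptable to print, reply with the single word {TERMINATION_KEYWORD} in all caps.
Never write {TERMINATION_KEYWORD} for any other reason.
If the copy is not acceptable, explain how to refine it without giving an example.
"""


def should_terminate(last_message: str) -> bool:
    """Case-sensitive on purpose, so lowercase prose cannot match by accident."""
    return TERMINATION_KEYWORD in last_message


print(should_terminate("Not approved. The copy feels tame."))  # False
print(should_terminate("TERMINATE"))                           # True
```

Whether the model reliably emits the keyword still depends on the model and the prompt, so the check narrows the failure mode rather than removing it.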

I haven't been seeing issues with the original code. As an exercise, I did the following:

TERMINATATION_KEYWORD = "approved"


class ApprovalTerminationStrategy(TerminationStrategy):
    """A strategy for determining when an agent should terminate."""

    async def should_agent_terminate(self, agent, history):
        """Check if the agent should terminate."""
        return TERMINATATION_KEYWORD in history[-1].content.lower()


REVIEWER_NAME = "ArtDirector"
REVIEWER_INSTRUCTIONS = f"""
You are an art director who has opinions about copywriting born of a love for David Ogilvy.
The goal is to determine if the given copy is acceptable to print.
If the copy is acceptable, state your approval with the single word "{TERMINATATION_KEYWORD}."
Do not use the word "{TERMINATATION_KEYWORD}" unless you are giving approval.
If not, provide insight on how to refine suggested copy without example.
There is no need to be nit-picky, if the slogan is acceptable, say so and complete the chat.
"""

I get these types of results:

# AuthorRole.USER: 'a slogan for a new line of electric cars.'
# AuthorRole.ASSISTANT - CopyWriter: '"Drive the Future: Shockingly Efficient."'
# AuthorRole.ASSISTANT - ArtDirector: 'The slogan needs refinement. Consider simplifying the message for clarity and impact. Emphasize the benefits of electric cars more directly, and avoid clichés. Focus on uniqueness and a strong call to action instead.'
# AuthorRole.ASSISTANT - CopyWriter: '"Electric Life: Join the Charge."'
# AuthorRole.ASSISTANT - ArtDirector: 'Approved.'
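Since these instructions ask for approval as a single word, one possible tightening is to require that the whole reply is the keyword once surrounding punctuation and quotes are stripped. This is a sketch under that assumption; `is_sole_keyword` is a hypothetical helper, not part of the sample.

```python
import string

# Characters ignored at either end of the reply: punctuation, quotes, whitespace.
_TRIM = string.punctuation + string.whitespace


def is_sole_keyword(content: str, keyword: str = "approved") -> bool:
    """True only when the reply consists of the keyword and nothing else."""
    return content.strip(_TRIM).lower() == keyword


print(is_sole_keyword("Approved."))                      # True
print(is_sole_keyword("Not approved. While the copy..."))  # False
```

The trade-off is strictness: a reply such as "Approved. Great work!" would no longer end the chat.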

@thecsw I'm not convinced there are updates required for the sample. As I mentioned before, these are supposed to be "getting started" samples and should provide some inspiration for how one can start -- the dev can absolutely take the sample and improve it on their own.

Hi, @moonbox3! Apologies for the delay. Yes, thinking about it more, it works as is. Given the stochastic nature of LLMs, sometimes it writes "Not approved" and sometimes it goes through and gives the expected nudges. In any case, even if this is an issue, this is an onboarding script: anyone who does encounter it has a great opportunity to debug and learn more.

thecsw and others added 2 commits March 2, 2025 13:19

Stronger termination condition for step3_azure_ai_agent_group_chat.

7bd2958

Merge branch 'microsoft:main' into sandy/step3_azure_ai_agent_group_c…

631d367

…hat_termination

thecsw requested a review from a team as a code owner March 3, 2025 19:45

markwallace-microsoft added the python Pull requests for the Python Semantic Kernel label Mar 3, 2025

github-actions bot changed the title ~~Sandy/step3 azure ai agent group chat termination~~ Python: Sandy/step3 azure ai agent group chat termination Mar 3, 2025

thecsw changed the title ~~Python: Sandy/step3 azure ai agent group chat termination~~ Fix FP termination in step3_azure_ai_agent_group_chat.py Mar 3, 2025

TaoChenOSU reviewed Mar 3, 2025


thecsw closed this Mar 12, 2025

thecsw commented Mar 3, 2025

TaoChenOSU Mar 3, 2025

thecsw Mar 3, 2025 • edited

moonbox3 Mar 3, 2025

moonbox3 Mar 4, 2025

moonbox3 Mar 12, 2025

moonbox3 Mar 12, 2025

thecsw Mar 12, 2025
