Feat: Safe OWL Society Termination #450

Open
wants to merge 11 commits into
base: main
from

Conversation

a7m-1st

@a7m-1st a7m-1st commented Apr 1, 2025

This Stop OWL feature lets users safely terminate long-running multi-turn processes. It uses existing mechanisms to break out of the for loop. The current limitation is that the loop breaks only after a full turn completes, so stopping takes time when CoT within a single agent is involved; achieving near-immediate termination would require drilling the event down into ChatbotAgent and society creation. This PR closes #362 by @didier-durand.

Screen.Recording.2025-04-01.025734.mp4

Added:

  1. A stop_owl function, which sets the STOP_REQUESTED thread event.
  2. The event is passed to run_society(), and a check was added in the for loop:
answer, chat_history, token_info = run_society(
    society=society,
    stop_event=STOP_REQUESTED,
)
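
The mechanism in the two items above can be sketched roughly as follows. This is a minimal sketch, not the PR's actual code: `run_turns`, `step`, and `max_turns` are hypothetical stand-ins for the loop inside run_society(); only STOP_REQUESTED and stop_owl are names taken from the description.

```python
import threading

# Module-level event shared between the UI handler and the worker loop.
STOP_REQUESTED = threading.Event()


def stop_owl():
    """Signal the running loop to stop (e.g. called from a Stop button)."""
    STOP_REQUESTED.set()


def run_turns(step, max_turns, stop_event=None):
    """Hypothetical stand-in for the multi-turn loop in run_society().

    `step` is any callable that performs one full turn and returns its result.
    """
    results = []
    for turn in range(max_turns):
        # The check runs once per turn, so a stop request takes effect
        # only after the turn that is currently running has finished.
        if stop_event is not None and stop_event.is_set():
            break
        results.append(step(turn))
    return results
```

Because the event is only inspected at the top of each iteration, a stop requested in the middle of a turn is honoured at the start of the next one, which matches the limitation described in this PR.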

Changed:

  1. I had to move the live_logs function to a separate thread. The code below was making the event queue wait until process_with_live_logs() finished, so stop_owl() events were queued behind it and effectively invalidated:
def run_button_event():
    # ..... (elided in the original)
    while bg_thread.is_alive():
        # Update conversation record display
        logs2 = get_latest_logs(100, LOG_QUEUE)

        # Always update status
        yield (
            "0",
            "<span class='status-indicator status-running'></span> Processing...",
            logs2,
        )

        time.sleep(1)
  2. After changing to async, generators (yield) are no longer supported in the asyncio threads, while we still need to send real-time data back to the components. State management therefore had to change to global or session state.

Limitation of Current Approach:

  1. The STOP_REQUESTED event is triggered on time, but the condition check is delayed until a full loop iteration completes, i.e. if Chain of Thought (CoT) is being processed by any one of the agents, it won't terminate until that turn finishes. Perhaps someone can help drill the STOP_REQUESTED event deeper into the chat agent or society creation.
  2. A separate issue makes the feature less useful than it could be. I have described it in this issue 👉 Building Society Simulation Error when Rerunning #449
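
To illustrate what "drilling" the event deeper could look like, here is a minimal sketch under stated assumptions: `run_turn` and `sub_steps` are hypothetical names, and the sub-steps merely stand in for whatever stages a single agent turn goes through (for example CoT stages). It does not reflect the actual ChatAgent internals.

```python
import threading


def run_turn(sub_steps, stop_event):
    """Hypothetical single turn made of several sub-steps.

    Returns the results of the sub-steps completed before a stop
    request was observed.
    """
    done = []
    for sub_step in sub_steps:
        # Checking here, rather than only between turns, lets a stop
        # request interrupt a long turn at the next sub-step boundary.
        if stop_event.is_set():
            break
        done.append(sub_step())
    return done
```

The finer the granularity of the sub-steps, the sooner a stop request is noticed; a single blocking call inside a sub-step would still run to completion.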

Notes on State Mgmt:

  • This code uses global state, which keeps the web page stateful across refreshes. However, the state will be shared with other users if sharing is enabled:

app.launch(share=True)

  • This can easily be solved by refactoring to Gradio's session state (gr.State({})). But because processing is decoupled, run_owl() also needs to be stateless. As a side effect, Gradio will create a new state after the page is refreshed while the daemon is still running OWL.
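
The difference between the two approaches can be shown without Gradio at all. The sketch below is plain Python and purely illustrative: `GLOBAL_STATE`, `DEFAULT_STATE`, `new_session_state`, and `mark_running` are hypothetical names, and `new_session_state` only imitates the idea of each session receiving its own copy of an initial value.

```python
import copy

# Global state: one dict shared by every visitor of the app.
GLOBAL_STATE = {"status": "idle", "logs": []}

# Template from which each session's private state is copied.
DEFAULT_STATE = {"status": "idle", "logs": []}


def new_session_state():
    """Hypothetical factory: each browser session gets its own copy."""
    return copy.deepcopy(DEFAULT_STATE)


def mark_running(state):
    """Update whichever state object the handler was given."""
    state["status"] = "running"
    return state
```

With the global dict, an update made on behalf of one user is visible to everyone; with per-session copies, a fresh session (including one created by a page refresh) starts from the default again, which is the side effect noted above.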

Just a note: I found that Gradio auto refresh was turned off intentionally. I think it was to simplify the code by using generators instead of managing state, but I am interested to know the reason (so as not to break anything :D).
Currently all changes are stored in a global STATE, and the app then updates the components every second:

app.load(update_interface, outputs=[token_count_output, status_output, log_display2, run_button, stop_button], every=1)
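
The pattern behind this polling call can be sketched in plain Python. This is an assumption-laden illustration rather than the PR's code: `worker`, `start`, and `STATE_LOCK` are hypothetical, the lock is an addition for thread safety, and only the names STATE and update_interface come from the description above.

```python
import threading
import time

STATE = {"status": "idle", "logs": []}
STATE_LOCK = threading.Lock()  # hypothetical; guards STATE across threads


def worker(n_steps):
    """Hypothetical background task that records its progress in STATE."""
    with STATE_LOCK:
        STATE["status"] = "running"
    for i in range(n_steps):
        time.sleep(0.01)  # stands in for real work
        with STATE_LOCK:
            STATE["logs"].append(f"step {i}")
    with STATE_LOCK:
        STATE["status"] = "done"


def update_interface():
    """Return a snapshot for the UI; meant to be called on a timer."""
    with STATE_LOCK:
        return STATE["status"], list(STATE["logs"])


def start(n_steps):
    """Launch the worker as a daemon thread and return immediately."""
    thread = threading.Thread(target=worker, args=(n_steps,), daemon=True)
    thread.start()
    return thread
```

Since the click handler returns as soon as the thread starts, later events such as a stop request are not held up behind the long-running task, and the periodic call to update_interface is what brings progress back to the page.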

@didier-durand
Contributor

This will be really useful! Thanks

Member

@Wendong-Fan Wendong-Fan left a comment

Thanks @a7m-1st! After we stop a running task, it seems we can't run another task anymore. Could we fix this?

@a7m-1st
Author

a7m-1st commented Apr 9, 2025

My pleasure @Wendong-Fan. I was facing the same issue previously too, for some reason, and pointed it out in issue #449. I will tend to it as well.

a7m-1st added a commit to camel-ai/camel that referenced this pull request May 2, 2025
@a7m-1st
Author

a7m-1st commented May 2, 2025

Screen.Recording.2025-05-02.144053.mp4

In the video you can see:

  • Multiple reruns are now possible ✅
  • Browser instances are closing successfully ✅
  • Token data is returned successfully ✅

Updates since 81e2574

  1. fix: Revert the button state when the stop button is clicked
  2. With the help of "feat: terminate BrowserToolkit to allow rerun" (camel#2194), this attempts to partially fix the issue "Building Society Simulation Error when Rerunning" (#449)
  3. Termination is more atomic thanks to integrating the stop_event into ChatAgent in "feat: add termination parameter to ChatAgent step and astep" (camel#2285)

Limitation:

Development

Successfully merging this pull request may close these issues.

Add a "Stop Task" button to the UI
3 participants