Description
First check
- I added a descriptive title to this issue.
- I used the GitHub search to look for a similar issue and didn't find it.
- I searched the Marvin documentation for this feature.
Describe the current behavior
A tool call for a state update can lead to an infinite loop if an error occurs that the app cannot resolve:
1. The application is prompted.
2. Marvin attempts to apply a state change.
3. An error is raised (e.g. Pydantic throws a validation error).
4. Marvin attempts to fix the error and tries again.
5. ...
Steps 3 and 4 loop indefinitely.
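The loop above can be sketched with a simple retry cap. This is a hypothetical, self-contained sketch, not Marvin's actual internals: `validate_state` stands in for the Pydantic validation step, and each retry simply re-attempts the same payload, so an unresolvable error stops after the cap instead of looping forever.

```python
def validate_state(candidate: dict) -> dict:
    # stand-in for Pydantic validation (step 3): require an int "count" field
    if not isinstance(candidate.get("count"), int):
        raise ValueError("count must be an int")
    return candidate


def apply_state_change(candidate: dict, max_retries: int = 3):
    """Try to apply a state change, giving up after max_retries attempts."""
    for _attempt in range(max_retries):
        try:
            # step 2: attempt the state change
            return validate_state(candidate)
        except ValueError:
            # steps 3/4: the error would be fed back to the model for a fix;
            # without a cap, these steps can repeat indefinitely
            continue
    return None  # give up instead of looping


apply_state_change({"count": 1})            # valid change succeeds
apply_state_change({"count": "not an int"})  # invalid change gives up after 3 tries
```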
Describe the proposed behavior
- I would like the ability to limit the number of attempts.
- I would like the ability to limit the number of tokens per call / per application.
Something like this:

```python
app.say("blabla", max_completion_tokens=100, max_retries=3)
app = Application(..., max_completion_tokens_per_say=100, max_retries_per_say=3)
```
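One way these two levels of configuration could compose, as a hedged sketch (the `Application` class below is a minimal stand-in, not Marvin's real one): per-call arguments override application-wide defaults.

```python
class Application:
    """Hypothetical sketch of per-application defaults for say()."""

    def __init__(self, max_completion_tokens_per_say=None, max_retries_per_say=None):
        self.max_completion_tokens_per_say = max_completion_tokens_per_say
        self.max_retries_per_say = max_retries_per_say

    def say(self, message, max_completion_tokens=None, max_retries=None):
        # per-call arguments take precedence over application-wide defaults
        tokens = max_completion_tokens or self.max_completion_tokens_per_say
        retries = max_retries or self.max_retries_per_say
        return {"max_completion_tokens": tokens, "max_retries": retries}


app = Application(max_completion_tokens_per_say=100, max_retries_per_say=3)
app.say("blabla")                 # uses the application-wide defaults
app.say("blabla", max_retries=1)  # per-call value overrides the default
```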
Example Use
I have a complex state that Marvin applications often struggle to make valid changes to. This can lead to "infinite loops". I would like better control over the maximum cost per run.
Additional context
I have started with an implementation of the idea, but I'm uncertain if I'm approaching it the correct way.
One could update the `run_async` method in src/marvin/beta/assistants/run.py to count the number of successive tool-output submissions like this:
```python
max_iterations = 2  # how many attempts at fixing the tool call are allowed
iteration_count = 0
while (
    handler.current_run.status == "requires_action"
    and iteration_count < max_iterations
):
    tool_outputs = await self.get_tool_outputs(run=handler.current_run)
    handler = event_handler_class(**self.event_handler_kwargs)
    async with client.beta.threads.runs.submit_tool_outputs_stream(
        thread_id=self.thread.id,
        run_id=self.run.id,
        tool_outputs=tool_outputs,
        event_handler=handler,
    ) as stream:
        await stream.until_done()
    await self._update_run_from_handler(handler)
    iteration_count += 1

if iteration_count >= max_iterations:
    logger.debug("Maximum iterations reached, stopping tool output submission.")
```
To limit per-run token usage, the OpenAI API already provides the `max_completion_tokens` argument. Maybe it's enough to pass that argument through to the `say` function / the application class?
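A minimal sketch of that pass-through (the helper name is hypothetical, and it only builds the request payload, never calling the API): caller-supplied settings such as `max_completion_tokens` are merged into the parameters that would eventually be forwarded to the OpenAI client.

```python
def create_run_params(thread_id: str, run_id: str, **model_kwargs) -> dict:
    """Build the payload that would be sent to the OpenAI runs endpoint.

    Hypothetical plumbing: forward caller-supplied model settings
    (e.g. max_completion_tokens) down from say()/Application, dropping
    any that were left unset so OpenAI defaults still apply.
    """
    params = {"thread_id": thread_id, "run_id": run_id}
    params.update({k: v for k, v in model_kwargs.items() if v is not None})
    return params


create_run_params("thread_abc", "run_xyz", max_completion_tokens=100)
create_run_params("thread_abc", "run_xyz")  # no limit: key is omitted entirely
```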