Python: Workflow API: Support runtime checkpoint management in run_stream() #1536

@victordibia

Design and discussion encouraged here.

Problem

The workflow API currently requires different methods for different execution contexts, unlike the unified agent API:

Agent API (unified):

agent.run_stream([message], thread=thread)  # Works for initial run, resumption, and responses

Workflow API (fragmented):

workflow.run_stream(message)                              # Initial run
workflow.run_stream_from_checkpoint(checkpoint_id, ...)   # Resume from checkpoint
workflow.send_responses_streaming(responses)              # Send HIL responses

This fragmentation creates challenges for applications that need to:

  • Manage checkpoint storage externally (e.g., distributed storage)
  • Scale workflow execution across multiple workers
  • Implement consistent HIL (human-in-the-loop) patterns
  • Maintain similar patterns between agent and workflow APIs

Current Limitations

  1. Large method surface: three separate methods (run_stream, run_stream_from_checkpoint, send_responses_streaming) where one unified interface would do
  2. Build-time checkpoint configuration: run_stream_from_checkpoint() accepts checkpoint storage at runtime, but checkpointing must still be enabled at build time for checkpoints to keep being written during execution
  3. No runtime checkpoint updates: a checkpoint storage passed at runtime is not updated automatically throughout the run unless checkpointing was also configured at build time

Proposed Solution

Align the workflow API with the agent API's patterns:

# Unified interface - checkpoint passed at runtime
workflow.run_stream([input], checkpoint=checkpoint)
workflow.run_stream([human_input_response], checkpoint=checkpoint)

# Checkpoint is updated automatically during execution
# Application manages checkpoint storage (enables distributed scenarios)
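
To make the proposal concrete for discussion, here is a minimal sketch of how a single run_stream() could decide between an initial run, a resumption, and a HIL response by inspecting the checkpoint it is given. Everything in it is an assumption for illustration: Checkpoint, ToyWorkflow, collect, and the event strings are hypothetical and are not part of the framework's actual API.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, AsyncIterator, List, Optional


@dataclass
class Checkpoint:
    """Hypothetical application-owned checkpoint (not a framework type)."""
    step: int = 0
    pending_request: Optional[str] = None
    history: List[str] = field(default_factory=list)


class ToyWorkflow:
    """Stand-in workflow that only illustrates the dispatch logic."""

    async def run_stream(
        self, inputs: List[Any], checkpoint: Optional[Checkpoint] = None
    ) -> AsyncIterator[str]:
        cp = checkpoint if checkpoint is not None else Checkpoint()
        if cp.pending_request is not None:
            # A request is outstanding, so the input is a HIL response.
            yield f"response to {cp.pending_request!r}: {inputs[0]}"
            cp.pending_request = None
        elif cp.step > 0:
            # Prior progress but nothing pending: resume.
            yield f"resumed at step {cp.step} with {inputs[0]}"
        else:
            # Fresh checkpoint: initial run.
            yield f"started with {inputs[0]}"
        # The caller's checkpoint object is updated in place during the run.
        cp.step += 1
        cp.history.append(str(inputs[0]))
        if cp.step == 1:
            # Simulate the workflow pausing for human input after step 1.
            cp.pending_request = "approve?"
            yield "request: approve?"


async def collect(workflow: ToyWorkflow, inputs: List[Any], checkpoint: Checkpoint) -> List[str]:
    """Drain the stream into a list so the events are easy to inspect."""
    return [event async for event in workflow.run_stream(inputs, checkpoint=checkpoint)]
```

Calling collect() twice with the same Checkpoint, first with a message and then with the human's answer, exercises the initial-run and HIL paths through the same method. A real design would also have to settle how the checkpoint is serialized and when it is flushed to storage.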

Benefits

  • Consistency: Aligns workflow and agent APIs, reducing cognitive load
  • Scalability: Applications can manage checkpoints externally, enabling distributed execution
  • Simplicity: Single method handles all execution contexts
  • Flexibility: Workflow definitions can be created once and executed with different checkpoint strategies
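
The scalability and flexibility points can be sketched as follows: if the application owns checkpoint storage, the same workflow step can run against different backends. The CheckpointStore protocol, both store classes, advance, and run_once are hypothetical names introduced only for this illustration; the framework's real storage interface may look different.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional, Protocol


class CheckpointStore(Protocol):
    """Hypothetical storage interface owned by the application."""

    def load(self, run_id: str) -> Optional[dict]: ...
    def save(self, run_id: str, state: dict) -> None: ...


class InMemoryStore:
    """Keeps checkpoints in a dict; suitable for tests or a single process."""

    def __init__(self) -> None:
        self._data = {}

    def load(self, run_id: str) -> Optional[dict]:
        return self._data.get(run_id)

    def save(self, run_id: str, state: dict) -> None:
        self._data[run_id] = dict(state)


class JsonFileStore:
    """Writes one JSON file per run; stands in for shared or remote storage."""

    def __init__(self, root: Path) -> None:
        self._root = root

    def load(self, run_id: str) -> Optional[dict]:
        path = self._root / f"{run_id}.json"
        return json.loads(path.read_text()) if path.exists() else None

    def save(self, run_id: str, state: dict) -> None:
        (self._root / f"{run_id}.json").write_text(json.dumps(state))


def advance(state: Optional[dict], message: str) -> dict:
    """Toy workflow step: returns a new state without mutating the input."""
    base = state if state else {"step": 0, "inputs": []}
    return {"step": base["step"] + 1, "inputs": base["inputs"] + [message]}


def run_once(store: CheckpointStore, run_id: str, message: str) -> dict:
    """Load the checkpoint, run one step, and persist the result."""
    state = advance(store.load(run_id), message)
    store.save(run_id, state)
    return state
```

Because run_once only depends on load and save, any worker with access to the store can pick up a run where another left off, which is the distributed scenario the proposal has in mind.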

Related Discussion

#1354

cc @moonbox3

Labels: python; squad: workflows (Agent Framework Workflows Squad); workflows (Related to Workflows in agent-framework)
