Add a Context dataclass and Pipe abstraction for transforming context between components #227

@willwoodward

Short description
Introduce a serializable, well-typed Context dataclass to carry structured data, metadata, and processing history between components. Add a simple Pipe abstraction (callable/Protocol or small base-class) so components and the Task Master can compose and apply context transformations in a predictable, testable way.

Why

  • The repo currently has no Context or Pipe primitive (codebase searches returned no matches).
  • A dedicated Context object makes it explicit what data flows between components (inputs, agents, knowledge bases, outputs, memory).
  • Pipes (transform functions or objects) allow reusable data transformations (sanitization, enrichment, logging, routing) to be composed and configured in .ww workflows.
  • Improves type safety, testability, and debuggability of the system’s component interactions.

Proposed design / API

  1. Context dataclass (suggested location: woodwork/interfaces/context.py or woodwork/core/context.py)
  • Fields:
    • id: Optional[str] = None
    • data: Dict[str, Any] = field(default_factory=dict) # main payload
    • metadata: Dict[str, Any] = field(default_factory=dict) # routing, source, tags, priority
    • history: List[Dict[str, Any]] = field(default_factory=list) # recorded steps with timestamps
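    • created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc)) # timezone-aware UTC; datetime.utcnow is deprecated as of Python 3.12
    • updated_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))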
  • Methods:
    • copy(self, deep: bool = True) -> "Context"
    • update_data(self, key: str, value: Any) -> None
    • merge(self, other: Mapping[str, Any]) -> None
    • add_metadata(self, key: str, value: Any) -> None
    • record_step(self, name: str, info: Optional[Mapping[str, Any]] = None) -> None
    • to_dict(self) -> Dict[str, Any] and classmethod from_dict(cls, d: Dict[str, Any]) -> "Context"
    • optionally: __getitem__/__setitem__ convenience methods to access data
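The field and method list above could be sketched roughly as follows. This is a minimal illustration, not a settled implementation: the timezone-aware timestamps, the shape of each history entry (name, info, timestamp keys), and the choice to emit datetimes as ISO 8601 strings in to_dict are all assumptions made for the example, and the optional __getitem__/__setitem__ helpers are left out.

```python
from __future__ import annotations

import copy as _copy
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Mapping, Optional


def _now() -> datetime:
    # Assumption: timezone-aware UTC timestamps.
    return datetime.now(timezone.utc)


@dataclass
class Context:
    id: Optional[str] = None
    data: Dict[str, Any] = field(default_factory=dict)       # main payload
    metadata: Dict[str, Any] = field(default_factory=dict)   # routing, source, tags
    history: List[Dict[str, Any]] = field(default_factory=list)
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)

    def copy(self, deep: bool = True) -> "Context":
        # A shallow copy shares the data/metadata/history containers.
        return _copy.deepcopy(self) if deep else _copy.copy(self)

    def update_data(self, key: str, value: Any) -> None:
        self.data[key] = value
        self.updated_at = _now()

    def merge(self, other: Mapping[str, Any]) -> None:
        self.data.update(other)
        self.updated_at = _now()

    def add_metadata(self, key: str, value: Any) -> None:
        self.metadata[key] = value
        self.updated_at = _now()

    def record_step(self, name: str, info: Optional[Mapping[str, Any]] = None) -> None:
        # Entry shape is an assumption made for this sketch.
        self.history.append(
            {"name": name, "info": dict(info or {}), "timestamp": _now().isoformat()}
        )
        self.updated_at = _now()

    def to_dict(self) -> Dict[str, Any]:
        # JSON-serializable as long as the payload values themselves are.
        return {
            "id": self.id,
            "data": self.data,
            "metadata": self.metadata,
            "history": self.history,
            "created_at": self.created_at.isoformat(),
            "updated_at": self.updated_at.isoformat(),
        }

    @classmethod
    def from_dict(cls, d: Dict[str, Any]) -> "Context":
        return cls(
            id=d.get("id"),
            data=dict(d.get("data", {})),
            metadata=dict(d.get("metadata", {})),
            history=list(d.get("history", [])),
            created_at=datetime.fromisoformat(d["created_at"]),
            updated_at=datetime.fromisoformat(d["updated_at"]),
        )
```

With this shape, a Context survives a json.dumps / json.loads round trip through to_dict and from_dict, which is the property the serialization tests below would check.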

Notes:

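  • Keep the dataclass lightweight and JSON-serializable for persistence / caching. Since datetime objects are not JSON-serializable as-is, to_dict should emit created_at and updated_at as ISO 8601 strings.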
  • Consider immutability or an explicit "frozen" mode later; for now mutable instances with explicit copy() are fine.
  2. Pipe abstraction (suggested location: woodwork/interfaces/pipe.py)
    Two simple API options (pick one):

A) Functional approach (minimal)

  • Define Pipe = Callable[[Context], Context]
  • Users implement plain functions that either mutate the given Context and return it, or return a new Context.
  • Support composition utilities: compose_pipes(pipes: Iterable[Pipe]) -> Pipe
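A minimal sketch of option A, assuming a trimmed stand-in for Context so the snippet runs on its own. The two example pipes (strip_pipe, upper_pipe) are hypothetical and exist only to show composition order.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List


@dataclass
class Context:
    # Trimmed stand-in for the proposed Context dataclass.
    data: Dict[str, Any] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)


Pipe = Callable[[Context], Context]


def compose_pipes(pipes: Iterable[Pipe]) -> Pipe:
    """Return a single Pipe that applies the given pipes left to right."""
    steps = tuple(pipes)  # snapshot, so a generator argument is consumed only once

    def composed(ctx: Context) -> Context:
        for step in steps:
            ctx = step(ctx)
        return ctx

    return composed


# Hypothetical example pipes.
def strip_pipe(ctx: Context) -> Context:
    ctx.data["user_input"] = ctx.data["user_input"].strip()
    ctx.history.append("strip")
    return ctx


def upper_pipe(ctx: Context) -> Context:
    ctx.data["user_input"] = ctx.data["user_input"].upper()
    ctx.history.append("upper")
    return ctx


pipeline = compose_pipes([strip_pipe, upper_pipe])
result = pipeline(Context(data={"user_input": "  hello world "}))
print(result.data["user_input"])  # HELLO WORLD
print(result.history)             # ['strip', 'upper']
```

Because the composed result is itself a Pipe, pipelines can be nested inside other pipelines without any special handling.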

B) Protocol / Base class (more explicit)

  • class Pipe(Protocol):
    def __call__(self, ctx: Context) -> Context: ...
    or

  • class BasePipe(ABC):
    def transform(self, ctx: Context) -> Context: ...
    def __call__(self, ctx: Context) -> Context: return self.transform(ctx)

  • Advantage: easier to add lifecycle methods (initialize, close), configuration, logging.
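Option B might look like the sketch below. The no-op initialize/close hooks, the name attribute, and the automatic history entry written in __call__ are assumptions added for illustration. LowercasePipe is a hypothetical subclass, and Context is again a trimmed stand-in.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Context:
    # Trimmed stand-in for the proposed Context dataclass.
    data: Dict[str, Any] = field(default_factory=dict)
    history: List[Dict[str, Any]] = field(default_factory=list)


class BasePipe(ABC):
    def __init__(self, name: Optional[str] = None) -> None:
        self.name = name or type(self).__name__

    def initialize(self) -> None:
        """Optional lifecycle hook; no-op by default."""

    def close(self) -> None:
        """Optional lifecycle hook; no-op by default."""

    @abstractmethod
    def transform(self, ctx: Context) -> Context:
        """Subclasses implement the actual transformation here."""

    def __call__(self, ctx: Context) -> Context:
        result = self.transform(ctx)
        # Assumption: the base class records each applied pipe in the history.
        result.history.append({"name": self.name})
        return result


class LowercasePipe(BasePipe):
    """Hypothetical pipe that lowercases one key of the payload."""

    def __init__(self, key: str) -> None:
        super().__init__()
        self.key = key

    def transform(self, ctx: Context) -> Context:
        ctx.data[self.key] = str(ctx.data[self.key]).lower()
        return ctx


pipe = LowercasePipe("user_input")
out = pipe(Context(data={"user_input": "Hello World"}))
print(out.data["user_input"])  # hello world
```

Since BasePipe instances are callable, they also satisfy the functional Callable[[Context], Context] alias from option A, so the two options are not mutually exclusive.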

  3. Integration points
  • Task Master (woodwork/core/task_master.py)
    • Accept and pass Context objects rather than ad-hoc dicts or multiple parameters.
    • Provide a hook to apply a pipeline of Pipes before/after invoking a component (pre/post).
    • Log Context.history entries for auditability.
  • Component interfaces (woodwork/interfaces/* and implementations under woodwork/components/)
    • Update component base classes/interfaces to accept Context and return Context (or None for side-effect components).
    • For backward compatibility, add adapter shims or helper functions to wrap old signatures.
  • Parser / config (.ww) (woodwork/parser/config_parser.py)
    • Optionally add syntax for declaring Pipes or referencing named pipes in component declarations, e.g.:
      • output_component = output { pipes = [sanitize_pipe, format_pipe] }
    • Or allow configuring pipelines as a property on workflows / component wiring.
  • Examples directory
    • Add an example pipeline in examples/ demonstrating Context + a few pipes (sanitize, enrich, persist).
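To make the pre/post hook idea concrete, here is one possible shape for it. The helper name invoke_with_pipes, the Component protocol, and the example component and pipes are all hypothetical; the real Task Master API may differ. The sketch treats a None return from process as "the component updated the Context in place", matching the side-effect case mentioned above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Protocol, Sequence


@dataclass
class Context:
    # Trimmed stand-in for the proposed Context dataclass.
    data: Dict[str, Any] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)


Pipe = Callable[[Context], Context]


class Component(Protocol):
    def process(self, ctx: Context) -> Optional[Context]: ...


def invoke_with_pipes(
    component: Component,
    ctx: Context,
    pre: Sequence[Pipe] = (),
    post: Sequence[Pipe] = (),
) -> Context:
    """Apply pre pipes, invoke the component, then apply post pipes."""
    for pipe in pre:
        ctx = pipe(ctx)
    result = component.process(ctx)
    if result is not None:  # side-effect components may return None
        ctx = result
    for pipe in post:
        ctx = pipe(ctx)
    return ctx


class ShoutComponent:
    """Hypothetical side-effect component: mutates ctx and returns None."""

    def process(self, ctx: Context) -> Optional[Context]:
        ctx.data["output"] = ctx.data["user_input"].upper()
        ctx.history.append("shout")
        return None


def strip_input(ctx: Context) -> Context:
    ctx.data["user_input"] = ctx.data["user_input"].strip()
    ctx.history.append("strip_input")
    return ctx


def add_punctuation(ctx: Context) -> Context:
    ctx.data["output"] += "!"
    ctx.history.append("add_punctuation")
    return ctx


result = invoke_with_pipes(
    ShoutComponent(),
    Context(data={"user_input": "  hello "}),
    pre=[strip_input],
    post=[add_punctuation],
)
print(result.data["output"])  # HELLO!
print(result.history)         # ['strip_input', 'shout', 'add_punctuation']
```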
  4. Example usage (pseudocode)
  • Python-level:

    • ctx = Context(data={"user_input": "Hello world"})
    • def sanitize_pipe(ctx: Context) -> Context:
      ctx.data["user_input"] = sanitize_text(ctx.data["user_input"])
      ctx.record_step("sanitize", {"status": "ok"})
      return ctx
    • def enrich_pipe(ctx: Context) -> Context:
      ctx.metadata["language"] = detect_language(ctx.data["user_input"])
      ctx.record_step("enrich")
      return ctx
    • pipeline = [sanitize_pipe, enrich_pipe]
    • for pipe in pipeline:
      ctx = pipe(ctx)
    • component.process(ctx) # components accept Context and return/update Context
  • .ww config (conceptual)

    • The parser should be extended to map a named pipe list to the pipeline used by a component.
  5. Tests
  • Unit tests
    • Context serialization/deserialization (to_dict/from_dict)
    • copy() semantics (deep vs shallow)
    • record_step adds entries with timestamps
    • Pipe composition returns a Context in expected shape
    • Basic component adapters that accept Context and produce expected output
  • Integration tests
    • A small workflow executed via Task Master that applies a pipeline across components and asserts that data and history flow correctly
  • Add tests under tests/ to cover new interfaces
  6. Documentation
  • Document Context fields, intended usage, and best practices.
  • Add example(s) to examples/ and README/docs explaining how to author Pipes and attach them to components or workflows.

Backward-compatibility & migration

  • Do not break existing component APIs in a single change; instead:
    • Add Context-accepting methods alongside current interfaces, or
    • Provide adapter helpers to convert (args -> Context) and (Context -> args)
  • Plan a follow-up to fully migrate components to Context once adapters and tests are in place.
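The adapter helpers could be as small as the sketch below. The names to_context, from_context, and context_adapter are hypothetical, as is legacy_summarise, which stands in for any existing function with a pre-Context signature. Context is a trimmed stand-in.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple


@dataclass
class Context:
    # Trimmed stand-in for the proposed Context dataclass.
    data: Dict[str, Any] = field(default_factory=dict)


def to_context(**kwargs: Any) -> Context:
    """args -> Context: pack keyword arguments into the payload."""
    return Context(data=dict(kwargs))


def from_context(ctx: Context, *keys: str) -> Tuple[Any, ...]:
    """Context -> args: pull named values out of the payload, in order."""
    return tuple(ctx.data[key] for key in keys)


def context_adapter(
    func: Callable[..., Any], input_key: str, output_key: str, **kwargs: Any
) -> Callable[[Context], Context]:
    """Wrap a legacy function so it accepts and returns a Context."""

    def wrapper(ctx: Context) -> Context:
        ctx.data[output_key] = func(ctx.data[input_key], **kwargs)
        return ctx

    return wrapper


# Hypothetical legacy function with an old-style signature.
def legacy_summarise(text: str, max_words: int = 3) -> str:
    return " ".join(text.split()[:max_words])


summarise = context_adapter(legacy_summarise, "user_input", "summary", max_words=2)
ctx = summarise(to_context(user_input="one two three four"))
print(ctx.data["summary"])  # one two
```

The wrapped function keeps working with its original signature, so existing callers are unaffected while new code passes Context objects.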

Acceptance criteria (What "done" looks like)

  • New Context dataclass implemented with required fields and helper methods.
  • Pipe abstraction introduced (either functional or class-based) with composition utilities.
  • Task Master updated to accept/pass Context objects and to apply configured pipes.
  • Component base/interface updated (or adapters added) to accept Context.
  • Parser/config updated or extended (or documented) to reference/attach named pipes to components or workflows.
  • Unit and integration tests added for Context and Pipe functionality.
  • Examples and docs updated demonstrating end-to-end usage.
  • No breaking changes to existing behavior (or a clear migration path/adapter is included).
  • CI tests pass.
