| 1 | +# pgflow – High-Level Overview |
| 2 | + |
| 3 | +## Purpose |
| 4 | + |
| 5 | +pgflow exists to let developers orchestrate complex, _database-centric_ workflows without leaving Postgres/Supabase. |
| 6 | +It replaces external “control planes” (Airflow, Temporal, etc.) with a zero-deployment, SQL-native engine that is easy to reason about, easy to observe with ordinary SQL, and easy to scale via Supabase Edge Functions. |
| 7 | + |
| 8 | +## Design Philosophy |
| 9 | + |
| 10 | +1. Postgres is the **single source of truth** – every definition, every state transition, every queue lives inside the database. |
| 11 | +2. **Opinionated over configurable** – we pick a clear “happy path” (DAGs, JSON payloads, id/slug naming, topological flow definition, etc.). If you agree with the opinions you get an ultra-simple mental model; if you don’t, pgflow is not for you. |
| 12 | +3. **Robust yet simple** – ACID state transitions, transactional task polling, back-pressure via pgmq, but no exotic features (no dynamic DAG rewrites, no conditional branching for now, no custom persistence layers). |
| 13 | +4. **Compile-time safety** – the TypeScript DSL ensures inputs/outputs line up; the SQL core enforces referential integrity; migrations are generated and applied, never built on the fly. |
| 14 | +5. **Serverless-ready** – execution happens in stateless workers (Supabase Edge Functions) that can crash, scale, or redeploy at will. The database keeps the truth. |
| 15 | + |
| 16 | +## Core Building Blocks |
| 17 | + |
| 18 | +Layer 1 – Definition (DSL) |
| 19 | +• Strongly-typed TypeScript API (`@pgflow/dsl`). |
| 20 | +• Generates: |
| 21 | + – Flow shape (slug, steps, edges, runtime options). |
| 22 | + – Handler functions (executed by Edge Worker). |
| 23 | +• Can be _compiled_ to SQL via `pgflow compile`, producing an idempotent migration file that inserts the flow & steps. |
| 24 | + |
| 25 | +Layer 2 – Orchestration (SQL Core) |
| 26 | +• Pure SQL (tables, constraints, functions, triggers). |
| 27 | +• Tables |
| 28 | + – Static: `flows`, `steps`, `deps`. |
| 29 | + – Runtime: `runs`, `step_states`, `step_tasks`. |
| 30 | +• Functions |
| 31 | + – Definition: `create_flow`, `add_step`. |
| 32 | + – Runtime: `start_flow`, `poll_for_tasks`, `complete_task`, `fail_task`. |
| 33 | +• Guarantees |
| 34 | + – All operations are transactional. |
| 35 | + – Visibility timeouts and exponential back-off handled in SQL. |
| 36 | + – Foreign keys enforce DAG correctness (steps must be added topologically). |
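The topological-order guarantee can be illustrated with a small in-memory model. This is a sketch only: `FlowDefinition`, `addStep` and `order` are hypothetical names rather than pgflow's API, and the `Map` lookup stands in for the foreign key that the SQL core relies on.

```typescript
// Hypothetical in-memory model; in pgflow the real check is a SQL constraint.
class FlowDefinition {
  readonly slug: string;
  private readonly steps = new Map<string, string[]>();

  constructor(slug: string) {
    this.slug = slug;
  }

  // A step may only depend on steps that already exist. Because a step can
  // never reference itself or a later step, the graph cannot contain a cycle.
  addStep(stepSlug: string, dependsOn: string[] = []): this {
    if (this.steps.has(stepSlug)) {
      throw new Error(`step "${stepSlug}" already exists`);
    }
    for (const dep of dependsOn) {
      if (!this.steps.has(dep)) {
        throw new Error(`step "${stepSlug}" depends on unknown step "${dep}"`);
      }
    }
    this.steps.set(stepSlug, dependsOn.slice());
    return this;
  }

  // Insertion order is already a valid topological order.
  order(): string[] {
    return Array.from(this.steps.keys());
  }
}

const exampleFlow = new FlowDefinition("analyze_website")
  .addStep("fetch")
  .addStep("summarize", ["fetch"])
  .addStep("save", ["summarize"]);

console.log(exampleFlow.order());
```

Declaring `save` before `summarize` would throw in this model, which is the same ordering rule the foreign key imposes on `add_step` calls.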
| 37 | + |
| 38 | +Layer 3 – Execution (Edge Worker) |
| 39 | +• A tiny Node/Deno script packaged as a Supabase Edge Function. |
| 40 | +• Polls the pgmq queue via `poll_for_tasks`, executes the handler, then reports back via `complete_task` or `fail_task`. |
| 41 | +• Auto-restarts, trivial horizontal scaling, no local state. |
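The poll, execute, report cycle might look roughly like the sketch below. `Task`, `TaskQueue`, `Handler` and `runOnce` are hypothetical names, the default batch size is arbitrary, and the `TaskQueue` interface merely stands in for calls to `poll_for_tasks`, `complete_task` and `fail_task`, whose real signatures are not shown in this document.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface Task {
  id: number;
  stepSlug: string;
  input: Json;
}

// Stand-in for the three SQL functions a worker would call.
interface TaskQueue {
  poll(batchSize: number): Promise<Task[]>;
  complete(taskId: number, output: Json): Promise<void>;
  fail(taskId: number, message: string): Promise<void>;
}

type Handler = (input: Json) => Promise<Json> | Json;

// One polling iteration. The worker keeps no state between iterations, so it
// can crash or be redeployed without losing track of a run.
async function runOnce(
  queue: TaskQueue,
  handlers: Record<string, Handler>,
  batchSize = 10,
): Promise<number> {
  const tasks = await queue.poll(batchSize);
  for (const task of tasks) {
    try {
      const handler = handlers[task.stepSlug];
      if (!handler) throw new Error(`no handler for step "${task.stepSlug}"`);
      await queue.complete(task.id, await handler(task.input));
    } catch (err) {
      // Any error, including a missing handler, is reported as a failure.
      const message = err instanceof Error ? err.message : String(err);
      await queue.fail(task.id, message);
    }
  }
  return tasks.length;
}
```

Passing the queue in as an interface keeps the sketch testable without a database; a real worker would back it with SQL calls.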
| 42 | + |
| 43 | +## Supporting Tooling |
| 44 | + |
| 45 | +CLI (`@pgflow/cli`) |
| 46 | +• `install` – copies SQL migrations, patches `supabase/config.toml`, seeds `.env`. |
| 47 | +• `compile` – turns a `.ts` flow into a timestamped migration (`YYYYMMDDHHMMSS_create_<slug>_flow.sql`). |
| 48 | +• Planned: flow scaffolding, local run/monitor, edge-worker template. |
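The migration naming scheme used by `compile` can be sketched as a pure function. `migrationFileName` is a hypothetical helper, and formatting the timestamp in UTC is an assumption; the CLI may derive it differently.

```typescript
// Hypothetical helper that builds YYYYMMDDHHMMSS_create_<slug>_flow.sql.
// The timestamp is formatted in UTC, which is an assumption of this sketch.
function migrationFileName(slug: string, now: Date = new Date()): string {
  const pad = (n: number, width = 2): string => String(n).padStart(width, "0");
  const timestamp =
    pad(now.getUTCFullYear(), 4) +
    pad(now.getUTCMonth() + 1) + // getUTCMonth() is zero-based
    pad(now.getUTCDate()) +
    pad(now.getUTCHours()) +
    pad(now.getUTCMinutes()) +
    pad(now.getUTCSeconds());
  return `${timestamp}_create_${slug}_flow.sql`;
}

const fixedTime = new Date(Date.UTC(2025, 0, 2, 3, 4, 5));
console.log(migrationFileName("analyze_website", fixedTime));
// prints "20250102030405_create_analyze_website_flow.sql"
```

A fixed-width, sortable timestamp prefix keeps migrations in creation order when listed alphabetically.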
| 49 | + |
| 50 | +Website / Docs (`pgflow.dev`) |
| 51 | +• Built with Astro / MDX, lives in `pkgs/website`. |
| 52 | +• Source of truth for conventions, examples, troubleshooting. |
| 53 | + |
| 54 | +## Tech Stack |
| 55 | + |
| 56 | +• PostgreSQL ≥14 (tested on 14-16). |
| 57 | +• pgmq for at-least-once message queueing. |
| 58 | +• Supabase (local & cloud) as the default distribution vehicle. |
| 59 | +• TypeScript, Deno (for compile step), Node (for CLI). |
| 60 | +• Nx monorepo; packages in `pkgs/*`; strict linting, Prettier, Vitest. |
| 61 | + |
| 62 | +## Key Conventions (Non-Negotiable) |
| 63 | + |
| 64 | +• Slugs: ≤128 chars, no leading digit/underscore, no spaces, only `[a-zA-Z0-9_]`. |
| 65 | +• DAG only – no cycles, no conditional edges, no runtime mutation. |
| 66 | +• Steps are added in topological order (enforced by FK). |
| 67 | +• Handlers **must return JSON-serialisable values**. |
| 68 | +• Inputs/Outputs are immutable; changing a step requires bumping the flow slug. |
| 69 | +• Retries: `max_attempts≥1`, `base_delay≥1s`, `timeout≥3s`. |
| 70 | +• Supabase `config.toml` must enable the connection pooler (`transaction` mode) and set `edge_runtime.policy = "per_worker"` (both handled by `pgflow install`). |
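Several of these conventions are mechanical enough to check in code. The sketch below is illustrative only: `isValidSlug`, `RetryOptions`, `checkRetryOptions` and `canEncodeAsJson` are hypothetical names, the option fields are assumed to be expressed in seconds, and the authoritative checks remain those of the SQL core.

```typescript
// First character must be a letter (no leading digit or underscore); the rest
// may be letters, digits or underscores, for at most 128 characters in total.
// Rejecting the empty string is an extra assumption of this sketch.
const SLUG_PATTERN = /^[a-zA-Z][a-zA-Z0-9_]{0,127}$/;

function isValidSlug(slug: string): boolean {
  return SLUG_PATTERN.test(slug);
}

// Field names and the use of seconds are assumptions for this sketch.
interface RetryOptions {
  maxAttempts: number;
  baseDelaySeconds: number;
  timeoutSeconds: number;
}

// Returns the list of violated limits; an empty list means the options pass.
// Requiring an integer for max_attempts is an assumption.
function checkRetryOptions(opts: RetryOptions): string[] {
  const problems: string[] = [];
  if (!Number.isInteger(opts.maxAttempts) || opts.maxAttempts < 1) {
    problems.push("max_attempts must be an integer >= 1");
  }
  if (!(opts.baseDelaySeconds >= 1)) problems.push("base_delay must be >= 1s");
  if (!(opts.timeoutSeconds >= 3)) problems.push("timeout must be >= 3s");
  return problems;
}

// True when JSON.stringify can encode the value at all. It rejects undefined,
// functions, circular structures and BigInt, but does not detect lossy cases
// such as undefined object properties being dropped.
function canEncodeAsJson(value: unknown): boolean {
  try {
    return JSON.stringify(value) !== undefined;
  } catch (_err) {
    return false;
  }
}
```

For example, `isValidSlug("analyze_website")` is `true`, while `isValidSlug("1st_flow")` and `isValidSlug("_private")` are both `false`.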
| 71 | + |
| 72 | +## Trade-offs & Non-Goals |
| 73 | + |
| 74 | +• No long-running (>15 min) tasks inside the Edge Worker yet – Edge Functions have a time limit. |
| 75 | +• No sub-flows / dynamic fan-out yet (fan-out _within_ a step via `task_index` is on the roadmap). |
| 76 | +• No cron / schedule engine – start runs manually or via your own triggers; use Supabase pg_cron for recurring runs. |
| 77 | + |
| 78 | +## Typical Flow Life-Cycle |
| 79 | + |
| 80 | +1. Author flow in TypeScript. |
| 81 | +2. `npx pgflow compile` → migration SQL. |
| 82 | +3. `supabase migration up` → schema + flow definitions live in Postgres. |
| 83 | +4. Client calls `SELECT pgflow.start_flow('my_flow', '{"foo":"bar"}')`. |
| 84 | +5. Worker polls, runs handlers, pushes results – state transitions in SQL. |
| 85 | +6. When `remaining_steps = 0`, the run is marked `completed` and the aggregated output is stored; your app can `LISTEN` or poll. |
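The completion rule in step 6 can be modelled in memory as a counter that drops each time a step finishes. This is a sketch and not the SQL core: `Run`, `startRun`, `readySteps` and `completeStep` are hypothetical names, storing every step's result in `output` is an assumption, and only the `remaining_steps` bookkeeping is taken from the text above.

```typescript
type Deps = Record<string, string[]>; // step slug -> slugs it depends on

interface Run {
  status: "started" | "completed";
  remainingSteps: number;
  unmetDeps: Map<string, number>; // per step: dependencies not yet completed
  output: Map<string, unknown>; // per completed step: its result
}

function startRun(deps: Deps): Run {
  const unmetDeps = new Map<string, number>();
  for (const [step, dependsOn] of Object.entries(deps)) {
    unmetDeps.set(step, dependsOn.length);
  }
  return { status: "started", remainingSteps: unmetDeps.size, unmetDeps, output: new Map() };
}

// Steps that could be handed to a worker right now.
function readySteps(run: Run): string[] {
  return Array.from(run.unmetDeps.entries())
    .filter(([step, unmet]) => unmet === 0 && !run.output.has(step))
    .map(([step]) => step);
}

function completeStep(run: Run, deps: Deps, step: string, result: unknown): void {
  if (run.unmetDeps.get(step) !== 0 || run.output.has(step)) {
    throw new Error(`step "${step}" is not ready`);
  }
  run.output.set(step, result);
  // Unblock every step that was waiting on the one that just finished.
  for (const [other, dependsOn] of Object.entries(deps)) {
    if (dependsOn.includes(step)) {
      run.unmetDeps.set(other, (run.unmetDeps.get(other) ?? 0) - 1);
    }
  }
  run.remainingSteps -= 1;
  if (run.remainingSteps === 0) run.status = "completed";
}
```

With `{ fetch: [], summarize: ["fetch"], save: ["summarize"] }`, only `fetch` is ready at the start, and the run flips to `completed` once all three steps have been completed in order.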
| 86 | + |
| 87 | +## Why This Document Matters |
| 88 | + |
| 89 | +All subsequent AI-assisted code changes reference this overview as the “north star”. |
| 90 | +When we debate an architectural choice, add a CLI sub-command, or fix a bug, we check that it: |
| 91 | +• Keeps the Postgres-first, three-layer model intact. |
| 92 | +• Respects the opinionated conventions above. |
| 93 | +• Preserves the **robust-yet-simple** ethos. |
| 94 | + |
| 95 | +If a change conflicts with these principles, we redesign rather than bolt on complexity. |