doc/design: A Small Coordinator for a More Scalable and Isolated Materialize #33082


Open · wants to merge 13 commits into main
Conversation

@aljoscha aljoscha commented Jul 18, 2025

@aljoscha aljoscha force-pushed the design-small-coordinator branch 9 times, most recently from 81383b7 to 211b670 on July 21, 2025 10:12
@aljoscha aljoscha changed the title from "DRAFT: A Small Coordinator for A More Scalable and Isolated Materialize" to "doc/design: A Small Coordinator for A More Scalable and Isolated Materialize" on Jul 21, 2025
@aljoscha aljoscha changed the title from "doc/design: A Small Coordinator for A More Scalable and Isolated Materialize" to "doc/design: A Small Coordinator for a More Scalable and Isolated Materialize" on Jul 21, 2025
@aljoscha aljoscha force-pushed the design-small-coordinator branch from d833825 to f16247d on July 21, 2025 19:24
@aljoscha aljoscha marked this pull request as ready for review July 21, 2025 19:24
Comment on lines 139 to 141
has to be involved when absolutely necessary. A good analogy might be CISC vs
RISC instruction sets, where CISC has fewer, more complex opcodes and RISC has
possibly more, but simpler opcodes.
Contributor

Nit: I think you mixed up the analogy here! CISC ISAs usually have more opcodes than RISC ISAs.

I think you do want the coordinator to be "RISC" in terms of the complexity (less) and number (fewer) of commands. But RISC machines also have to execute more commands than CISC machines to implement the same logic, whereas I think we also want the coordinator to execute fewer commands than it does now.

Contributor Author

Yeah, I messed up that one part. CISC actually has more opcodes, and those opcodes are more complicated. So the analogy works quite well.

Where before the frontend would send one EXECUTE SELECT, it would now send the smaller GET CATALOG, GET THIS CLIENT, GET THAT CLIENT messages. But the frontend would also keep those clients around, so it doesn't have to run those commands for every peek.
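A minimal sketch of that command split, with entirely hypothetical names (none of these are Materialize's actual API): one big command today versus a few small, cacheable ones.

```rust
// Hypothetical sketch of the "RISC-style" command split discussed above.
// All names here are illustrative, not Materialize's real command types.

#[derive(Debug)]
enum BigCommand {
    // Today: the Coordinator plans, picks a timestamp, and issues the peek
    // itself, all on its single main loop.
    ExecuteSelect { sql: String },
}

#[derive(Debug)]
enum SmallCommand {
    // Proposed: the frontend asks the Coordinator only for the small pieces
    // it needs, and caches the results.
    GetCatalogSnapshot,
    GetComputeClient { instance: String },
}

// The frontend runs these once and keeps the returned catalog snapshot and
// compute client around, so later peeks skip the Coordinator entirely.
fn commands_for_first_peek() -> Vec<SmallCommand> {
    vec![
        SmallCommand::GetCatalogSnapshot,
        SmallCommand::GetComputeClient { instance: "default".into() },
    ]
}

fn main() {
    let big = BigCommand::ExecuteSelect { sql: "SELECT 1".into() };
    println!("before: {:?}", big);
    println!("after: {} small commands, then cached clients", commands_for_first_peek().len());
}
```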

Comment on lines +175 to +176
- `controller_ready(compute)`: the compute controller signaling that a peek
result is ready and the Coordinator needs to act.
Contributor

Both controllers also become ready when frontiers advance. Probably doesn't show up here because it gets drowned out by the peek responses, but it is also something we should fix. The reason we wake up on all frontier changes is that there might have been watches registered for query lifecycle tracking and we need to check those somewhere. But we shouldn't check them on the coordinator main loop.
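The wakeup pattern described here can be sketched as follows; event and function names are hypothetical. The point is that frontier advancements wake the main loop only so that lifecycle watches can be checked, and that check could live off the main loop.

```rust
use std::sync::mpsc;

// Hypothetical event types for the Coordinator's main-loop wakeups.
#[derive(Debug)]
enum ControllerEvent {
    PeekResponse { conn_id: u32 },
    FrontierAdvanced { collection: String },
}

fn needs_coordinator_action(ev: &ControllerEvent) -> bool {
    match ev {
        // A finished peek genuinely needs the Coordinator to forward results.
        ControllerEvent::PeekResponse { .. } => true,
        // Frontier advances only matter for registered lifecycle watches;
        // checking them elsewhere would keep the main loop quiet.
        ControllerEvent::FrontierAdvanced { .. } => false,
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    tx.send(ControllerEvent::FrontierAdvanced { collection: "t1".into() }).unwrap();
    tx.send(ControllerEvent::FrontierAdvanced { collection: "t2".into() }).unwrap();
    tx.send(ControllerEvent::PeekResponse { conn_id: 7 }).unwrap();
    drop(tx);
    // Only one of the three wakeups actually needs Coordinator action.
    let actionable = rx.iter().filter(needs_coordinator_action).count();
    println!("actionable wakeups: {actionable}");
}
```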

Comment on lines +266 to +269
controller](https://github.com/MaterializeInc/materialize/pull/29559). With
that work fully realized, both for the storage controller and the remaining
compute controller moments a visualization of the workflow would look like
this:
Contributor

I think it already basically looks like this for the compute controller. At least it doesn't really do anything on process, just maybe sends a "run maintenance" command to the instance tasks and then returns a stashed response, if it has any.

Sending the "run maintenance" command is cheap. But we can also remove it, I think, by giving each instance task its own maintenance ticker.
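The "own maintenance ticker" idea might look like this sketch, using plain threads and channels as stand-ins for the real instance tasks (all names hypothetical): each instance ticks itself, so the controller never has to send a "run maintenance" command on its own wakeups.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical instance task that owns its own maintenance ticker.
fn spawn_instance_task(name: &'static str, tick: Duration, done: mpsc::Sender<&'static str>) {
    thread::spawn(move || {
        // The real task would loop forever; one tick suffices for the sketch.
        thread::sleep(tick);
        // Here the task would do its maintenance work (compaction etc.),
        // without the controller having told it to.
        done.send(name).unwrap();
    });
}

fn run() -> Vec<&'static str> {
    let (tx, rx) = mpsc::channel();
    spawn_instance_task("instance-a", Duration::from_millis(5), tx.clone());
    spawn_instance_task("instance-b", Duration::from_millis(5), tx);
    // rx.iter() ends once both tasks have sent and dropped their senders.
    let mut ticked: Vec<_> = rx.iter().collect();
    ticked.sort();
    ticked
}

fn main() {
    println!("maintained: {:?}", run());
}
```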

layer.
- We don't want to improve throughput benchmark numbers, only remove the
Coordinator as a bottleneck. Our work might increase throughput numbers or
reveal similar bottlenecks in other parts of the system.
Contributor

I think moving the current work of peek sequencing to the frontend could increase QPS even without horizontal scaling, because:

- Doing most of the peek sequencing work on the per-session frontend task would allow us to at least vertically scale envd inside a single machine when a particular user wants a bit more QPS.
- I imagine that the current staged execution has some overhead, which would disappear if we had straight-line code instead.
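The "straight-line code on per-session tasks" point can be sketched like this, with threads standing in for tasks and entirely hypothetical function names: each session runs its sequencing steps top to bottom instead of re-entering a staged state machine on one shared loop, so sessions can spread across cores.

```rust
use std::thread;

// Hypothetical straight-line peek sequencing for one session: no stages,
// no hand-off back to a central loop between steps.
fn sequence_peek(session: u32) -> String {
    let plan = format!("plan(session={session})");
    let ts = "choose_timestamp()";
    format!("peek[{plan}, {ts}]")
}

fn main() {
    // One task per session; each sequences its peek independently.
    let handles: Vec<_> = (0..4)
        .map(|s| thread::spawn(move || sequence_peek(s)))
        .collect();
    let peeks: Vec<String> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    println!("issued {} peeks concurrently", peeks.len());
}
```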
