Summary:
Pull Request resolved: #426
tl;dr: the `comm actor`'s world ID can differ from `ProcMesh`'s `world_id`, so we should not attest it.
`Comm actor`, which is spawned through `ProcessAllocator`, gets its world ID from a uuid:
https://www.internalfb.com/code/fbsource/[a29e91672b216c626792ec9406e77a922b9d88dc]/fbcode/monarch/hyperactor_mesh/src/alloc/process.rs?lines=108-109%2C420
`ProcMesh` gets its `world_id` from the `Alloc` it uses.
https://www.internalfb.com/code/fbsource/[a29e91672b216c626792ec9406e77a922b9d88dc]/fbcode/monarch/hyperactor_mesh/src/proc_mesh.rs?lines=95%2C99-101%2C300%2C302%2C317
`MastAllocator` uses `RemoteProcessAlloc`, and explicitly uses the `task_group_name` as its `world_id`.
https://www.internalfb.com/code/fbsource/[a29e91672b216c626792ec9406e77a922b9d88dc]/fbcode/monarch/hyperactor_meta/src/alloc.rs?lines=274-275%2C284%2C394%2C397%2C399
For example, here is a log from a test run. The `ProcMesh` has world ID `test_task_group`, while the comm actors (and the procs they run on) have `_1C8Rf4TR6jZe`.
> I0703 09:09:51.966694 416137 fbcode/monarch/hyperactor_mesh/src/actor_mesh.rs:154] binding actor mesh ProcMesh { world_id: WorldId("test_task_group"), shape: Shape { labels: ["hosts", "gpus"], slice: Slice { offset: 0, sizes: [1, 2], strides: [2, 1] } }, ranks: [(ProcId(WorldId("_1C8Rf4TR6jZe"), 0), (Unix(Bound("vujWCqQQ55kCU2NQQlCC2q3N" (abstract))), ActorRef { actor_id: ActorId(ProcId(WorldId("_1C8Rf4TR6jZe"), 0), "mesh", 0), phantom: PhantomData<hyperactor_mesh::proc_mesh::mesh_agent::MeshAgent> })), (ProcId(WorldId("_1C8Rf4TR6jZe"), 1), (Unix(Bound("vujWCqQQ55kCU2NQQlCC2q3N" (abstract))), ActorRef { actor_id: ActorId(ProcId(WorldId("_1C8Rf4TR6jZe"), 1), "mesh", 0), phantom: PhantomData<hyperactor_mesh::proc_mesh::mesh_agent::MeshAgent> }))], client: Mailbox { inner: State { actor_id: ActorId(ProcId(WorldId("test_task_group_manager"), 0), "client", 0), open_ports: [17082645012990790806], next_port: 1036 } }, comm_actors: [ActorRef { actor_id: ActorId(ProcId(WorldId("_1C8Rf4TR6jZe"), 0), "comm", 0), phantom: PhantomData<hyperactor_mesh::comm::CommActor> }, ActorRef { actor_id: ActorId(ProcId(WorldId("_1C8Rf4TR6jZe"), 1), "comm", 0), phantom: PhantomData<hyperactor_mesh::comm::CommActor> }] }
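To make the mismatch concrete, here is a minimal sketch using simplified stand-in types. `WorldId`, `ProcId`, and `ActorId` below only mirror the shapes printed in the log, and both helper functions are hypothetical; none of this is the actual `hyperactor_mesh` code. It shows that a comm actor ID built from the mesh's `world_id` does not match the ID of the comm actor that is actually running.

```rust
// Simplified stand-ins that mirror the shapes in the log above.
// These are NOT the real hyperactor types.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WorldId(pub String);

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ProcId(pub WorldId, pub usize);

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ActorId(pub ProcId, pub String, pub usize);

/// Hypothetical helper: builds a comm actor ID by assuming the comm actor
/// lives in the same world as the mesh. This is the assumption that breaks.
pub fn comm_id_from_mesh_world(mesh_world: &WorldId, rank: usize) -> ActorId {
    ActorId(ProcId(mesh_world.clone(), rank), "comm".to_string(), 0)
}

/// Hypothetical helper: builds a comm actor ID from the proc ID the
/// allocator actually reported, so the world ID always matches.
pub fn comm_id_from_proc(proc_id: &ProcId) -> ActorId {
    ActorId(proc_id.clone(), "comm".to_string(), 0)
}
```

With the values from the log, `comm_id_from_mesh_world` yields an ID in world `test_task_group`, while the running comm actor is in world `_1C8Rf4TR6jZe`, so the two IDs compare unequal.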
Reviewed By: mariusae
Differential Revision: D77737951
fbshipit-source-id: 4f70aad14e984b1bc89bea79bf3a3ed8218048e0