2025-07-21 09:58:17.622138: I external/xla/xla/pjrt/pjrt_c_api_client.h:379] c_to_cpp_device_map_:
2025-07-21 09:58:17.622143: I external/xla/xla/pjrt/pjrt_c_api_client.h:381] Key: 0x5609eef8b160, Value: 0x5609eef69070
2025-07-21 09:58:17.622148: I external/xla/xla/pjrt/pjrt_c_api_client.h:383] c_device: 0x5609eef8b160
Entering PJRT_Memory_Kind
Exiting PJRT_Memory_Kind
Entering function: PJRT_Device_DefaultMemory
Exiting function: PJRT_Device_DefaultMemory
2025-07-21 09:58:17.622170: I external/xla/xla/pjrt/pjrt_c_api_client.h:391] c_to_cpp_memory_map_:
2025-07-21 09:58:17.622178: I external/xla/xla/pjrt/pjrt_c_api_client.h:393] Key: 0x5609eef42c10, Value: 0x5609eef43fb0
2025-07-21 09:58:17.622186: I external/xla/xla/pjrt/pjrt_c_api_client.h:395] c_memory: 0x5609eef42c10
Traceback (most recent call last):
  File "/xxx/distributed.py", line 15, in <module>
    assert jax.device_count() == 2
           ^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
Hello,
I am implementing PJRT for my custom device and have two Docker containers, each exposing one device.
I defined the script distributed.py:
docker-1.sh:
In a different terminal, in the other Docker container:
And finally I got the log and AssertionError shown above.
As far as I know, JAX designates a root node and waits for the other nodes to connect. Once every node has joined and is ready, the process continues. What I want to know is: when a node joins the group, how is it recognized? I assume there must be a PJRT_Client, PJRT_Platform, or something similar that incrementally adds (push_back) each incoming distributed device and records the new device's information.
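For context, here is a minimal sketch of the kind of script involved. The original distributed.py is not reproduced here, so this assumes the standard jax.distributed.initialize entry point. The environment variable names (COORDINATOR_ADDRESS, NUM_PROCESSES, PROCESS_ID) and the coordinator_config helper are hypothetical and only serve the illustration.

```python
import os


def coordinator_config(env):
    """Build keyword arguments for jax.distributed.initialize from a mapping.

    The variable names COORDINATOR_ADDRESS, NUM_PROCESSES and PROCESS_ID are
    hypothetical; they are not taken from the original scripts.
    """
    return {
        "coordinator_address": env["COORDINATOR_ADDRESS"],  # "host:port" of the root node
        "num_processes": int(env.get("NUM_PROCESSES", "2")),  # defaults to two containers
        "process_id": int(env["PROCESS_ID"]),  # process 0 hosts the coordinator
    }


def main():
    import jax  # imported lazily so the helper above works without JAX installed

    # Connects to the coordinator; the call waits for the other processes.
    jax.distributed.initialize(**coordinator_config(os.environ))

    # local_device_count() counts devices attached to this process;
    # device_count() counts devices across all processes.
    print("process", jax.process_index(), "of", jax.process_count())
    print("local devices:", jax.local_device_count())
    print("global devices:", jax.device_count())
    assert jax.device_count() == 2


# Run only when the (hypothetical) environment variables are set.
if __name__ == "__main__" and "COORDINATOR_ADDRESS" in os.environ:
    main()
```

Under these assumptions, each container would run the script with the same COORDINATOR_ADDRESS and with PROCESS_ID set to 0 or 1. Comparing the local and global counts shows whether the backend is reporting only its own device, which is what the failing assertion suggests.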
Thanks