-
There are two different methods for sharding an array, namely:

```python
import jax
from jax.experimental import mesh_utils
import jax.numpy as jnp
from jax.sharding import NamedSharding, Mesh, PartitionSpec as P
import numpy as np

devices = jax.devices()
a0 = jnp.zeros((1024, 2048))

devices_a1 = mesh_utils.create_device_mesh((1, 8))
devices_a2 = np.array(devices).reshape(1, 8)

sharding_a1 = NamedSharding(mesh=Mesh(devices_a1, ('a', 'b')), spec=P('a', 'b'))
sharding_a2 = NamedSharding(mesh=Mesh(devices_a2, ('a', 'b')), spec=P('a', 'b'))

a1 = jax.device_put(a0, sharding_a1)
a2 = jax.device_put(a0, sharding_a2)

jax.debug.visualize_array_sharding(a1)
jax.debug.visualize_array_sharding(a2)
```

I originally thought that these two methods would yield the same result, but it turns out that the order of devices is different (as shown by `visualize_array_sharding`). Given that the results differ, which sharding method should I use? Are there any performance differences between the two when performing parallel computations? Possibly related to #19661.
Replies: 1 comment
-
The `mesh_utils` version takes into account the hardware geometry, and will result in a mesh with a more efficient layout for the particular hardware you are running on. For this reason you should use `mesh_utils` rather than constructing a mesh manually from the ordered list of devices.
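For illustration, here is a minimal sketch (assuming a single host with 8 devices, as in the original snippet) of how one might inspect the device ordering each approach produces; on TPU hardware the two orderings will generally differ, while on some single-host setups they may coincide:

```python
import jax
import numpy as np
from jax.experimental import mesh_utils

devices = jax.devices()

# Hardware-aware layout: orders devices to match the physical interconnect topology.
devices_a1 = mesh_utils.create_device_mesh((1, 8))

# Manual layout: simply reshapes jax.devices() in its default enumeration order.
devices_a2 = np.array(devices).reshape(1, 8)

# Print the device ids along the mesh axes to compare the two orderings.
print([d.id for d in devices_a1.ravel()])
print([d.id for d in devices_a2.ravel()])
```

Either mesh produces a valid sharding; the difference is that the hardware-aware ordering tends to place neighboring mesh positions on physically adjacent devices, which matters for communication-heavy collectives.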