Dispatching parallel data input dynamically is an important way to handle embedding model parallelism in recommender models. Please tell me how to support this feature.
Here is a snapshot from the Meta paper:
This discussion was converted from issue #22296 on July 08, 2024 11:10.
For example
input = [0,1,2,3,4,5,6,7]
The input is regarded as one training sample and is segmented along the batch dimension with shard_map; 'i' is the mesh axis name.
device 0:
input = [0,1,2,3]
device 1:
input = [4,5,6,7]
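
(A minimal sketch of this split, assuming a 2-device mesh with axis name 'i'; the mesh construction and the helper name `local_view` are illustrative assumptions, not part of the original example:)

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, PartitionSpec as P
from jax.experimental.shard_map import shard_map

# Assumes at least 2 devices; 'i' is the mesh axis used for the batch split.
mesh = Mesh(np.array(jax.devices()[:2]), axis_names=('i',))

def local_view(x):
    # Inside shard_map each device sees only its own shard:
    # device 0 -> [0, 1, 2, 3], device 1 -> [4, 5, 6, 7]
    return x

f = shard_map(local_view, mesh=mesh, in_specs=P('i'), out_specs=P('i'))
out = f(jnp.arange(8))
```

Calling `f(jnp.arange(8))` returns the full array, but inside `local_view` each device only ever sees its 4-element shard.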
And I need to use lax.ppermute to exchange the input between different devices:
device 0:
lax.ppermute(input[0:1], 'i', [(0,1),(1,0)])
device 1:
lax.ppermute(input[1:3], 'i', [(0,1),(1,0)])
Note that jax.disable_jit() does not help here.
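
(For contrast, a hedged sketch of the case that does work under jit: ppermute applied to equal-shaped per-device shards. The mesh setup and the helper name `exchange` are my own assumptions, not from the post. The device-dependent slice bounds above, `input[0:1]` on device 0 versus `input[1:3]` on device 1, are the part that fails, because every device traces the same program under jit and ppermute expects identical operand shapes on all participants:)

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax import lax
from jax.sharding import Mesh, PartitionSpec as P
from jax.experimental.shard_map import shard_map

mesh = Mesh(np.array(jax.devices()[:2]), axis_names=('i',))

def exchange(x):
    # x is the local shard: [0, 1, 2, 3] on device 0, [4, 5, 6, 7] on device 1.
    # Swap the full shards between the two devices; both operands have the
    # same shape, which is what ppermute expects.
    return lax.ppermute(x, axis_name='i', perm=[(0, 1), (1, 0)])

f = jax.jit(shard_map(exchange, mesh=mesh, in_specs=P('i'), out_specs=P('i')))
out = f(jnp.arange(8))  # device 0 now holds [4, 5, 6, 7], device 1 holds [0, 1, 2, 3]
```

Supporting genuinely unequal per-device shards would presumably require padding to a common shape or a ragged/all-to-all style collective, which seems to be what this feature request is asking about.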