Creating multiple workers to do DQN #17755
Unanswered
raymondchua
asked this question in Q&A
Replies: 1 comment
-
Hi, it seems like you can do this with `jax.pmap`. For example, here is a simple structure I use:

```python
import jax
from flax.jax_utils import replicate, unreplicate
from flax.training.common_utils import shard

def compute_fn(params, input):
    grad = <compute grad here>
    # note that `axis_name` should be the same as the one passed to `pmap`
    grad = jax.lax.pmean(grad, axis_name="batch")
    return grad

# make the model parameters available on all devices
params = <model>
params = replicate(params)

# make a `pmap` version of `compute_fn`
p_compute_fn = jax.pmap(compute_fn, axis_name="batch", donate_argnums=(0,))

# shard the data across all devices
data = <input data>
data = shard(data)

grad = p_compute_fn(params, data)

# `grad` is still replicated across the devices; it can be retrieved with
grad = unreplicate(grad)
```

This is what I usually do. Maybe there is a better approach out there. Hope this helps.
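For reference, here is a minimal self-contained sketch of the same pattern that should run as-is. The toy linear model, the squared-error loss, and the synthetic batch are assumptions purely for illustration; `replicate`, `unreplicate`, and `shard` are Flax utilities.

```python
import jax
import jax.numpy as jnp
from flax.jax_utils import replicate, unreplicate
from flax.training.common_utils import shard

def loss_fn(params, batch):
    # toy linear model with a squared-error loss (illustrative assumption)
    inputs, targets = batch
    preds = inputs @ params["w"] + params["b"]
    return jnp.mean((preds - targets) ** 2)

def compute_fn(params, batch):
    grads = jax.grad(loss_fn)(params, batch)
    # average gradients across devices; axis_name must match the pmap below
    return jax.lax.pmean(grads, axis_name="batch")

n_devices = jax.local_device_count()

# toy parameters, replicated so every device holds its own copy
params = {"w": jnp.zeros((4, 1)), "b": jnp.zeros((1,))}
params = replicate(params)

p_compute_fn = jax.pmap(compute_fn, axis_name="batch")

# synthetic batch; the global batch size must be divisible by n_devices
inputs = jnp.ones((8 * n_devices, 4))
targets = jnp.zeros((8 * n_devices, 1))
batch = shard((inputs, targets))  # adds a leading device axis

grads = p_compute_fn(params, batch)
grads = unreplicate(grads)  # take the first device's copy
```

Because `pmean` already averages the gradients inside the pmapped function, every device ends up with identical gradients, so `unreplicate` simply returns the copy held by the first device.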
-
Hi all, I hope this is the right place to ask this question. I would like to be able to collect experience from 64 parallel environment copies into a single replay buffer, for example using 8 parallel workers with shared model parameters and averaging gradients between workers. Does anyone know if this can be done in JAX using `vmap` and `jit`? Perhaps similar to this: #11565 (reply in thread)