Memory-efficient way to obtain a categorical sample #16862

blackblitz · 2023-07-27T10:44:47Z

I would like to draw a large sample of size n from a categorical distribution with a large number of categories m, but I often run into an out-of-memory error. The traceback shows that the function builds an array of shape (n, m), so the memory requirement is about 4mn bytes (assuming 4-byte float32 values). For example, a sample of size 10000 from a categorical distribution with 10000 categories needs 4 * 10^8 bytes (about 400 MB) for that one array, and the cost grows quickly as n and m increase. Is there a way to do this with less memory?

jakevdp · 2023-07-27T13:49:09Z

Maintainer

JAX's categorical sampler works by taking an argmax over the logits perturbed with Gumbel-distributed noise (the Gumbel-max trick): https://github.com/google/jax/blob/a03d6e66137e0bb79350f5af81d39cb4c27e4a70/jax/_src/random.py#L1501-L1504

I think this is where the large memory footprint is coming from. One way you could address this is by doing a scan over smaller batches of categorical samples.
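To make the footprint concrete, here is a rough sketch of the Gumbel-max idea. It is an illustration, not the actual JAX source: the helper name `gumbel_max_sample` is made up for this example, and the sizes are kept small so it runs anywhere. The point is that one Gumbel value is drawn per (sample, category) pair, which is where an (n, m) intermediate comes from.

```python
import jax
import jax.numpy as jnp


def gumbel_max_sample(key, logits, n):
    """Hypothetical helper: draw n categorical samples via the Gumbel-max trick."""
    m = logits.shape[-1]
    # One Gumbel draw per (sample, category) pair: an (n, m) intermediate array.
    noise = jax.random.gumbel(key, (n, m), dtype=logits.dtype)
    # The index of the largest perturbed logit is a categorical sample.
    return jnp.argmax(logits + noise, axis=-1)


key = jax.random.PRNGKey(0)
logits = jnp.ones(1000)
samples = gumbel_max_sample(key, logits, 500)  # shape (500,)
```

With float32 logits the noise array alone takes about 4 * n * m bytes, so the cost scales with the product of the sample size and the number of categories.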

2 replies

blackblitz Aug 5, 2023
Author

Could you show how to implement the scan?

jakevdp Aug 7, 2023
Maintainer

Sure, rather than something like this:

import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
logits = jnp.ones(10000)

out = jax.random.categorical(key, logits, shape=(10000,))

You could use something like this:

out = jax.lax.map(lambda key: jax.random.categorical(key, logits, shape=()),
                  jax.random.split(key, 10000))

Both produce 10000 categorical samples, but the second approach is done sequentially rather than attempting to allocate all intermediate values in a single array.
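If one sample per step turns out to be slower than you'd like, a middle ground is to map over batches of samples instead. The following is a minimal sketch under assumptions: the helper name `categorical_in_batches` and the batch sizes are illustrative, and the total sample count is assumed to split evenly into batches.

```python
import jax
import jax.numpy as jnp


def categorical_in_batches(key, logits, num_batches, batch_size):
    """Hypothetical helper: draw num_batches * batch_size samples, one batch at a time."""
    keys = jax.random.split(key, num_batches)

    def sample_batch(k):
        # Each step only needs a (batch_size, m) intermediate.
        return jax.random.categorical(k, logits, shape=(batch_size,))

    out = jax.lax.map(sample_batch, keys)  # shape (num_batches, batch_size)
    return out.reshape(-1)


key = jax.random.PRNGKey(0)
logits = jnp.ones(10000)
out = categorical_in_batches(key, logits, num_batches=100, batch_size=100)
```

Here each step should need roughly 4 * batch_size * m bytes for the noise array rather than 4 * n * m, so the batch size can be tuned to trade memory for speed. The samples will not match those from a single call with the same key, since different keys are used, but they come from the same distribution.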
