Hi all,
I need to process batches from a sparse document-term matrix, retrieving both the densified rows and their corresponding indices. To enable JIT compilation, JAX requires me to convert the matrix into a sparse BCOO format.
However, indexing the sparse matrix efficiently is challenging. The best performance occurs when I first fully densify the matrix and then extract the batched rows. Unfortunately, this approach is impractical for large matrices, as it consumes too much memory. On the other hand, if I attempt to index the sparse matrix first and then densify only the selected rows, I run out of memory (OOM).
You can find a minimal example below. I am running the code on an AWS ml.g5.2xlarge instance. When executing the `get_batch_2()` function, I get the error:

`XlaRuntimeError: RESOURCE_EXHAUSTED: Out of memory while trying to allocate 148330749608 bytes.`

That is roughly 138 GiB. When executing the `get_batch_1()` function, things work. This is curious to me, since `get_batch_1()` densifies the whole matrix, whereas `get_batch_2()` only densifies the batch. If I double the number of documents, both approaches fail.

Can anyone help?
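A minimal sketch of the two approaches described above (the toy matrix, its shape, and the function bodies are illustrative assumptions; only the names `get_batch_1()`/`get_batch_2()` come from the post):

```python
import jax.numpy as jnp
from jax.experimental import sparse

# Toy stand-in for the document-term matrix (the real one is far larger):
# 100 "documents", 50 "terms", one nonzero per row.
dense = jnp.zeros((100, 50)).at[jnp.arange(100), jnp.arange(100) % 50].set(1.0)
mat = sparse.BCOO.fromdense(dense)  # sparse BCOO format, as required for JIT

def get_batch_1(mat, idx):
    # Densify the whole matrix, then take the batch rows.
    # Works here, but needs memory for the full dense matrix.
    return mat.todense()[idx]

def get_batch_2(mat, idx):
    # Index the sparse matrix first, then densify only the batch
    # (assuming integer-array indexing on BCOO, as the OOM report suggests).
    return mat[idx].todense()

idx = jnp.array([0, 3, 7])  # hypothetical batch indices
batch = get_batch_1(mat, idx)  # shape (3, 50)
```

Both functions should return the same dense batch; the question is why the seemingly cheaper `get_batch_2()` allocates more memory than the full densification.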
Thanks!