Replies: 2 comments
@maedoc Did you end up finding the answer? Is it possible this is just a lucky result? I would imagine that sharding the sparse index arrays might lead to situations in which an index does not correspond to the matching shard of the dense array (e.g. I shard the index array, and its 4th shard contains an index that lives on the 6th shard of the dense array).
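To make that concern concrete, here is a small sketch in plain NumPy, so it does not depend on any device setup. It assumes a COO layout (`rows`, `cols`, `vals`), hypothetical sizes, and that every shard can read the whole of `x`; none of this is taken from the original MWE. It only shows that the arithmetic does not care how the nonzeros are split: each chunk scatters into its own full-length partial result, and the partial results sum to the unsharded answer. Whether XLA lowers a sharded scatter-add this way is up to the compiler.

```python
import numpy as np

rng = np.random.default_rng(0)
n, nnz, n_shards = 6, 12, 3  # hypothetical sizes; nnz must divide evenly by n_shards

# Random COO matrix: entry k is vals[k] at position (rows[k], cols[k]).
rows = rng.integers(0, n, nnz)
cols = rng.integers(0, n, nnz)
vals = rng.standard_normal(nnz)
x = rng.standard_normal(n)

# Reference: a single scatter-add over all nonzeros.
y_full = np.zeros(n)
np.add.at(y_full, rows, vals * x[cols])

# "Sharded" version: split the nonzeros into chunks. A chunk may hold indices
# pointing anywhere in x or y, so each chunk gets a full-length partial result.
y_parts = np.zeros((n_shards, n))
chunks = zip(np.split(rows, n_shards), np.split(cols, n_shards), np.split(vals, n_shards))
for s, (r, c, v) in enumerate(chunks):
    np.add.at(y_parts[s], r, v * x[c])

# Summing the partial results (the "communication" step) recovers the full product.
y_sharded = y_parts.sum(axis=0)
```

The price of this scheme is the final reduction over shards and the need for each shard to see all of `x`, which is where background communication would come from if the dense array were sharded as well.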
@jcandane I think the result is that yes, the placement of computation really does just follow the placement of data, as promised. The downside, for my use cases, was that the performance is not nearly as good as I expected, at least on CPU: e.g. a basic CSR SpMV routine in C, multithreaded with OpenMP, is more performant. Other experiments with JAX led me to conclude that XLA (the compiler that JAX targets, which uses LLVM for CPU code generation) seems pretty good on GPU, but for high-performance CPU code one can look elsewhere. In one example of Monte Carlo simulations, the NumPy code is indeed 4x slower than JAX, but JAX is 4-8x slower than a C implementation which uses SIMD effectively. Even though the data layout is the same in the JAX and C code, so that SIMD can be used, XLA seems to generate mainly scalar code, which, on CPU, is a big performance loss.
While knocking my head against the wall with `pmap` over a simple sparse matrix example, I started reading the docs on sharding and expected it to be harder. My question is: have I misunderstood what's going on? Am I going to hit some snag with the communication in the background? Here's a MWE.

As far as I can tell, this actually works for parallelizing a sparse matrix-vector multiply, and there's no communication written out explicitly. If it's really this easy, it would perhaps be worth sharing some examples, since I don't recall seeing anything in the docs about the semantics of `x.at[idx].add` in a sharding context, and it completely surprised me that it worked (for both `y3` and `y4` above).

Edit: added the output of the above script.