Just a reminder: for numerical computations, Python loops (even those in comprehensions) are very slow, so you'd best avoid them whenever possible. If you truly want to stay with this kind of ragged array, I'd recommend looking at …
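As a toy sketch of the kind of vectorized alternative this advice points at (the arrays, distance metric, and threshold below are all made up for illustration), a single broadcasted expression plus a boolean mask can replace nested Python loops with per-element `if` checks:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((50, 3))   # small stand-in for the (5520, 3) array
b = rng.random((60, 3))   # small stand-in for the (6667, 3) array

# Pairwise squared distances via broadcasting: (50, 1, 3) - (1, 60, 3)
# gives (50, 60, 3); summing the last axis gives one (50, 60) result
# in a single vectorized expression, with no Python-level loop.
d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)

# Boolean masking replaces per-element "if" tests: argwhere returns
# the (i, j) index pairs that pass the (illustrative) threshold.
close_pairs = np.argwhere(d2 < 0.1)
```

The loop body moves into numpy's compiled internals, which is where the speedup comes from.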
-
I was pleased to discover xarray recently; thanks for all the videos and tutorials. I'm not sure yet whether I understand xarray's capabilities well enough.
Currently I'm using numpy, starting with (as a small example) ndarrays of shapes (5520, 3) and (6667, 3) holding float64 data that fit easily in 16 GB of RAM. For my use case the intermediate numpy calculations come together in an ndarray of shape (k, m, n), i.e. (5520, 5520, 6667), of float64 data, which would require about 1.5 TB of RAM. Fortunately I don't need all of these data points, so I wrote a for loop and used numpy's masking and fancy indexing to keep only the required ones. In this for loop I used two Python lists to build a list of lists, then converted that list of lists into a numpy ndarray for further processing.

The shape of this ndarray is ((k), (m*, n*)), i.e. ((5520), (m*, n*)). Note that (k) is separated from (m, n): for each element of k there is an (m*, n*) ndarray where m* and n* vary in size (m* from 1 to 5520, n* from 0 to 6667). This makes it hard to analyse memory consumption (I wrote another for loop to figure out that 7.2 GB is still required), so swapping to disk makes the for loop slowwww.
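To make the setup concrete, here is a small sketch of that pattern (the actual computation and the 0.2 threshold are hypothetical placeholders; only the loop structure, masking, and ragged result mirror the description above). Each iteration builds one (m, n) slab instead of the full (k, m, n) cube, masks it, and appends a trimmed (m*, n*) piece:

```python
import numpy as np

rng = np.random.default_rng(1)
k, m, n = 40, 40, 50           # toy stand-ins for 5520, 5520, 6667
A = rng.random((k, 3))
B = rng.random((n, 3))

results = []                    # ragged: one (m*, n*) piece per k
for i in range(k):
    # Hypothetical intermediate: one (m, n) slab at a time, so peak
    # memory stays near one slab rather than the whole cube.
    slab = np.abs(A[i, 0] * A[:, 1][:, None] - B[:, 2][None, :])
    mask = slab < 0.2           # keep only the required data points
    # Trim to the rows/columns that contain any kept value: the
    # piece shape (m*, n*) then varies from iteration to iteration.
    piece = slab[np.ix_(mask.any(axis=1), mask.any(axis=0))]
    results.append(piece)

# Ragged pieces make memory accounting manual, as noted above:
total_bytes = sum(r.nbytes for r in results)
```

Because the pieces differ in shape, numpy can only hold them as an object array (or a plain list), which is why tools built for ragged data come up in this discussion.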
Can this be solved with xarray? Would it require storing intermediate data on disk?
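For context on the disk question: xarray handles larger-than-memory data by delegating to dask-backed chunked arrays (via `.chunk(...)`), which compute block by block; whether anything hits disk then depends on the scheduler and chunk sizes. As a library-free illustration of the "intermediates on disk" option, numpy's memory-mapped `.npy` files let you write and read one slab at a time without holding the full cube in RAM (sizes below are toy stand-ins):

```python
import os
import tempfile
import numpy as np

k, m, n = 20, 20, 30            # toy stand-ins for 5520, 5520, 6667
path = os.path.join(tempfile.mkdtemp(), "intermediate.npy")

# An on-disk .npy file: the OS pages (m, n) slabs in and out, so
# peak RAM stays near one slab instead of the whole (k, m, n) cube.
cube = np.lib.format.open_memmap(path, mode="w+",
                                 dtype=np.float64, shape=(k, m, n))
rng = np.random.default_rng(42)
for i in range(k):
    cube[i] = rng.random((m, n))   # fill one slab at a time
cube.flush()

# Later reads only touch the slices you actually index.
row_means = cube[0].mean(axis=1)
```

This trades RAM for disk I/O, so it is only a win when, as here, most of the cube is never needed at once.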