Replies: 1 comment 1 reply
-
Our current groupby and resample implementations are inefficient with dask arrays; I suspect that's what's happening. It would be good to check the chunk sizes of … Since you're resampling from hourly to daily, I would try … If you're up for experimenting, you could try …
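The concrete suggestions in this reply were cut off in the thread. A minimal sketch of the kind of experiment it seems to point toward, assuming (as in the question below) an hourly wind-speed array built from ERA5-Land data; the file path and the `u10`/`v10` variable names are assumptions, not taken from the reply:

```python
import xarray as xr

# Assumed file path and variable names; day-aligned time chunks are one common
# way to make hourly -> daily resampling cheaper with dask.
ds = xr.open_dataset("ERA5_Land_wind.nc", chunks={"time": 24 * 30})
ERA5_WS = (ds["u10"] ** 2 + ds["v10"] ** 2) ** 0.5

# Check chunk sizes before and after the resample; very many tiny chunks
# (or a few enormous ones) are the usual cause of memory blow-ups.
print(ERA5_WS.chunks)

ERA5_WS_daily = ERA5_WS.resample(time="1D").mean()
print(ERA5_WS_daily.chunks)

# With time chunks covering whole days, each daily mean reads from a single
# chunk, so loading the result should stay close to the size of the result.
ERA5_WS_daily.load()

# Optional experiment: install the flox package ("pip install flox");
# recent xarray versions then use it automatically to speed up
# groupby/resample on dask arrays.
```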
-
I am working with large datasets and trying to do some basic computation on these large files. Here are the dimensions of the ERA5_Land_wind Dataset: [time (hours): 351384, latitude: 271, longitude: 611]. I can convert to wind speed (ERA5_WS) and even resample to daily means fairly quickly (10 to 20 seconds). However, when I try to load the ERA5_WS_daily DataArray (daily means, about 7 GB, [time (days): 14642, latitude: 271, longitude: 611]) into memory, it consumes all my memory (128 GB). Shouldn't it only consume the size of the resampled DataArray? Also, when I try different chunk sizes, resampling to daily means (ERA5_WS_daily_chunked) consumes all my memory. Something seems off here. Any help would be great.
Hopefully, this code snippet is helpful:
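(The original snippet was not captured in the thread. A hypothetical reconstruction of the workflow described above; the file path and the `u10`/`v10` variable names are assumptions, while the array names are the ones used in the question.)

```python
import xarray as xr

# Open lazily with dask; the chunk size here is an assumption, chosen only to
# mirror "trying different chunk sizes" from the question.
ERA5_Land_wind = xr.open_dataset("ERA5_Land_wind.nc", chunks={"time": 1000})

# Wind speed from the 10 m wind components (lazy, so this is essentially instant).
ERA5_WS = (ERA5_Land_wind["u10"] ** 2 + ERA5_Land_wind["v10"] ** 2) ** 0.5

# Resampling hourly -> daily is also lazy, which is why it "finishes" in seconds.
ERA5_WS_daily = ERA5_WS.resample(time="1D").mean()

# This is the step where memory blows up: computing the daily means ends up
# holding far more than the ~7 GB result in RAM.
ERA5_WS_daily = ERA5_WS_daily.load()

# Trying other chunk sizes before resampling shows the same behaviour.
ERA5_WS_daily_chunked = (
    ERA5_WS.chunk({"time": 5000}).resample(time="1D").mean().load()
)
```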