-
Hi all. I'm trying to run an xarray computation, and I can't figure out how to do it in a way that doesn't blow up my memory usage.

>> foo
<xarray.DataArray 'foo' (time: 125560, lat: 192, lon: 288)>
dask.array<open_dataset-6cd755527450782900ac724b8dfc6443foo, shape=(125560, 192, 288), dtype=float32, chunksize=(496, 192, 288), chunktype=numpy.ndarray>
Coordinates:
* lat (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 87.17 88.12 89.06 90.0
* lon (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8
* time (time) datetime64[ns] 2015-01-01 ... 2100-12-31T18:00:00
>> time
<xarray.DataArray 'time' (date: 31390, lat: 192, lon: 288)>
dask.array<open_dataset-ec860d52be70a7aa840f50a6c063d53ctime, shape=(31390, 192, 288), dtype=datetime64[ns], chunksize=(1962, 12, 36), chunktype=numpy.ndarray>
Coordinates:
* date (date) datetime64[ns] 2015-01-01 2015-01-02 ... 2100-12-31
* lat (lat) float64 -90.0 -89.06 -88.12 -87.17 ... 87.17 88.12 89.06 90.0
* lon (lon) float64 0.0 1.25 2.5 3.75 5.0 ... 355.0 356.2 357.5 358.8

I'd like to run a vectorized selection of foo at the per-gridpoint timestamps in time. I have a dask cluster available, but when I run this under the distributed scheduler, xarray seems to want to load everything into memory eagerly.
I don't understand this behaviour: neither the eager evaluation, nor why xarray feels like it needs to gather the entire indexer into memory. Is there a way to run this indexing in a parallelised fashion? A small self-contained sketch of what I mean is below. Thanks!
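Here is that sketch, with tiny synthetic stand-ins for foo and time (the exact call in my script is a vectorized .sel along time; the synthetic data and the precise method call are just for illustration):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Tiny stand-ins for `foo` and `time`: same dimension layout as the
# reprs above, just much smaller, with `foo` chunked via dask.
timestamps = pd.date_range("2015-01-01", periods=8, freq="6h")
lat = np.linspace(-90, 90, 4)
lon = np.arange(0, 360, 60.0)

foo = xr.DataArray(
    np.random.rand(len(timestamps), len(lat), len(lon)).astype("float32"),
    coords={"time": timestamps, "lat": lat, "lon": lon},
    dims=("time", "lat", "lon"),
    name="foo",
).chunk({"time": 4})

# One timestamp to pick out of `foo` for every (date, lat, lon) point.
dates = pd.date_range("2015-01-01", periods=2, freq="D")
time = xr.DataArray(
    np.broadcast_to(
        dates.values[:, None, None], (len(dates), len(lat), len(lon))
    ).copy(),
    coords={"date": dates, "lat": lat, "lon": lon},
    dims=("date", "lat", "lon"),
    name="time",
)

# Vectorized (pointwise) label-based indexing: the result has dims
# (date, lat, lon). In my real case `time` is itself dask-backed, and
# this is the step that blows up memory under the distributed scheduler.
result = foo.sel(time=time)
```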
-
This is not supported by dask. On main I get the nice error:

The indexer needs to be in memory for vectorized indexing.
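In practice that means something like this (a sketch, assuming the computed indexer fits in memory):

```python
# Load the indexer into memory first; `foo` can stay dask-backed, and
# the vectorized selection is then handled by dask, so the data itself
# is still indexed lazily, chunk by chunk.
time_in_memory = time.compute()   # or time.load()
result = foo.sel(time=time_in_memory)
```

If the computed indexer is itself too large to hold at once, doing the selection in blocks along date (compute a slice of time, select, write out, repeat) keeps the peak memory bounded.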