-
That thread is about lazy concatenation without using

    In [10]: import xarray as xr
        ...: from xarray.tests import raise_if_dask_computes
        ...:
        ...: ds = xr.tutorial.open_dataset("rasm").chunk({"time": 8, "x": 5, "y": 5})
        ...:
        ...: with raise_if_dask_computes(max_computes=0):
        ...:     ds1 = ds.isel(time=slice(0, 18))
        ...:     ds2 = ds.isel(time=slice(18, None))
        ...:
        ...:     concat = xr.concat([ds1, ds2], dim="time")
        ...:
        ...: xr.testing.assert_identical(ds, concat)
Instead of doing this manually using a loop, you should also be able to use
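One loop-free possibility (an assumption on my part, not necessarily what the reply above was pointing to) is xr.open_mfdataset, which opens and combines many files lazily:

```python
# A minimal sketch, assuming xr.open_mfdataset is the intended helper;
# the path pattern, engine and chunk sizes are made up for illustration.
import xarray as xr

ds = xr.open_mfdataset(
    "images/*.tif",                 # hypothetical path pattern
    engine="rasterio",              # requires rioxarray to be installed
    combine="nested",
    concat_dim="time",              # stack the files along a new "time" dimension
    chunks={"x": 1024, "y": 1024},  # keep everything as lazy dask arrays
)
```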
-
@keewis so I am able to create a 21 GB of
But what I don't like about this solution is that I am heavily dependent on the default functionalities of
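If the concern is about relying on a combining helper's defaults (my reading of the cut-off sentence above, which may be wrong), one way to regain control is to pass explicit arguments and a per-file preprocess step instead of accepting the defaults:

```python
# A hedged sketch: preprocess_tile, the variable name "band_data", and the
# chunk sizes are hypothetical, just to show how the defaults of
# xr.open_mfdataset can be overridden rather than relied upon.
import xarray as xr

def preprocess_tile(ds):
    # normalise each file before combining, e.g. keep only the variable of interest
    return ds[["band_data"]]

ds = xr.open_mfdataset(
    "images/*.tif",                # hypothetical path pattern
    engine="rasterio",             # requires rioxarray to be installed
    preprocess=preprocess_tile,    # explicit per-file cleanup, not the default
    combine="nested",              # explicit combine strategy, not the default
    concat_dim="time",
    join="exact",                  # fail loudly instead of silently aligning coordinates
    chunks={"x": 1024, "y": 1024},
)
```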
-
Hi.
I have a lot of satellite images in my local directory. I want to create an xarray dataset from n satellite images (each satellite image is huge in memory). Normally, one would iteratively create xarray DataArray objects, concatenate them, and then convert the result to a Dataset. But this approach leads to memory issues, as xr.concat forces dask arrays to load (thread: #4628). What would be the right approach to solve this issue with xarray and dask?

A few things to note:

- The images are .tif files.
- Opening them with xr.open_rasterio leads to a dataset with the max image (the image with the maximum size) repeated n times, where n is the number of images. I would have to resize each image to the same size and then save it back, which is tedious work and does not seem an efficient way to accomplish this task.

Below is my code at a high level, but I still run into memory issues even after chunking with dask.
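As a rough illustration of the loop described above (the paths, chunk sizes and variable name are hypothetical placeholders, not the actual code):

```python
# A hedged sketch of the iterative open-and-concat approach described above;
# file paths, chunk sizes and the variable name are hypothetical.
import glob
import xarray as xr

paths = sorted(glob.glob("satellite_images/*.tif"))

arrays = []
for path in paths:
    # chunks=... keeps each image as a lazy dask array instead of loading it.
    # Note: xr.open_rasterio is deprecated in newer xarray versions in favour
    # of rioxarray.open_rasterio.
    arrays.append(xr.open_rasterio(path, chunks={"band": 1, "x": 1024, "y": 1024}))

# Stack along a new "time" dimension; this is the step that reportedly
# runs into memory problems.
combined = xr.concat(arrays, dim="time")
ds = combined.to_dataset(name="image")
```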