Closed
Working my way through understanding cubed / cubed-xarray.
I'm trying to get an example working that modifies the chunking of an Xarray dataset and writes it to Zarr. When I roundtrip the Zarr store to and from Xarray, the chunking structure doesn't seem to have changed. Is using the .chunk method on an Xarray dataset with cubed viable, or should I be using the rechunk primitive?
Roundtrip example using Xarray + dask chunks
import xarray as xr
from zarr.storage import TempStore
ts = TempStore('air_temp_dask.zarr')
ds = xr.tutorial.open_dataset('air_temperature', chunks={})
rds = ds.chunk({'time':1})
rds.to_zarr(ts, consolidated=True)
rtds = xr.open_zarr(ts, chunks={})
rtds
assert rtds.chunks == rds.chunks
Roundtrip example using Xarray + cubed
from cubed import Spec
import xarray as xr
from zarr.storage import TempStore
ts = TempStore('air_temp_cubed.zarr')
spec = Spec(work_dir='tmp', allowed_mem='2GB')
ds = xr.tutorial.open_dataset('air_temperature', chunked_array_type='cubed',
                              from_array_kwargs={'spec': spec}, chunks={})
rds = ds.chunk({'time':1}, chunked_array_type="cubed")
# does compute need to be called?
# rds.compute()
rds.to_zarr(ts, consolidated=True, chunkmanager_store_kwargs={'from_array_kwargs': {'spec': spec}})
rtds = xr.open_zarr(ts, chunked_array_type='cubed',
                    from_array_kwargs={'spec': spec}, chunks={})
# This fails
assert rtds.chunks == rds.chunks
🤞 this is an end-of-day brain implementation issue on my end.