Skip to content

Can't isel on VirtualiZarr of multi-dimensional NetCDF files #647

@alexgleith

Description

@alexgleith

I have what I would hope is a simple use case for VirtualiZarr, which is to create a lazy-loaded xarray of existing NetCDF files.

Code I'm using currently is this:

from virtualizarr import open_virtual_mfdataset
from obstore.store import HTTPStore
from virtualizarr.parsers import HDFParser

files = [
    "https://thredds.nci.org.au/thredds/fileServer/gb6/BRAN/BRAN2020/month/ocean_temp_mth_1993_01.nc",
    "https://thredds.nci.org.au/thredds/fileServer/gb6/BRAN/BRAN2020/month/ocean_temp_mth_1993_02.nc",
]

parser = HDFParser()
store = HTTPStore(url="https://thredds.nci.org.au")

ds = open_virtual_mfdataset(files, object_store=store, parser=parser)
ds

This works ok, although datetimes are in float64.

Image

When it comes to selecting a single time and depth though, it fails.

one = ds.temp.isel(Time=0, st_ocean=0)
one

With this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[3], line 1
----> 1 one = ds.temp.isel(Time=0, st_ocean=0)
      2 one

File /opt/homebrew/lib/python3.11/site-packages/xarray/core/dataarray.py:1555, in DataArray.isel(self, indexers, drop, missing_dims, **indexers_kwargs)
   1550     return self._from_temp_dataset(ds)
   1552 # Much faster algorithm for when all indexers are ints, slices, one-dimensional
   1553 # lists, or zero or one-dimensional np.ndarray's
-> 1555 variable = self._variable.isel(indexers, missing_dims=missing_dims)
   1556 indexes, index_variables = isel_indexes(self.xindexes, indexers)
   1558 coords = {}

File /opt/homebrew/lib/python3.11/site-packages/xarray/core/variable.py:1023, in Variable.isel(self, indexers, missing_dims, **indexers_kwargs)
   1020 indexers = drop_dims_from_indexers(indexers, self.dims, missing_dims)
   1022 key = tuple(indexers.get(dim, slice(None)) for dim in self.dims)
-> 1023 return self[key]

File /opt/homebrew/lib/python3.11/site-packages/xarray/core/variable.py:769, in Variable.__getitem__(self, key)
    766 dims, indexer, new_order = self._broadcast_indexes(key)
    767 indexable = as_indexable(self._data)
--> 769 data = indexing.apply_indexer(indexable, indexer)
    771 if new_order:
    772     data = duck_array_ops.moveaxis(data, range(len(new_order)), new_order)

File /opt/homebrew/lib/python3.11/site-packages/xarray/core/indexing.py:1029, in apply_indexer(indexable, indexer)
   1027     return indexable.oindex[indexer]
   1028 else:
-> 1029     return indexable[indexer]

File /opt/homebrew/lib/python3.11/site-packages/xarray/core/indexing.py:1555, in NumpyIndexingAdapter.__getitem__(self, indexer)
   1551 # We want 0d slices rather than scalars. This is achieved by
   1552 # appending an ellipsis (see
   1553 # https://numpy.org/doc/stable/reference/arrays.indexing.html#detailed-notes).
   1554 key = indexer.tuple + (Ellipsis,)
-> 1555 return array[key]

File /opt/homebrew/lib/python3.11/site-packages/virtualizarr/manifests/array.py:261, in ManifestArray.__getitem__(self, key)
    256     raise TypeError(
    257         f"indexer must be of type int, slice, ellipsis, None, or np.ndarray; or a tuple of such types. Got {key}"
    258     )
    260 # check value is valid
--> 261 indexer = _possibly_expand_trailing_ellipsis(indexer, self.ndim)
    262 if len(indexer) != self.ndim:
    263     raise ValueError(
    264         f"Invalid indexer for array. Indexer length must be less than or equal to the number of dimensions in the array, "
    265         f"but indexer={indexer} has length {len(indexer)} and array has {self.ndim} dimensions."
    266         f"\nIf concatenating using xarray, ensure all non-coordinate data variables to be concatenated include the concatenation dimension, "
    267         f"or consider passing `data_vars='minimal'` and `coords='minimal'` to the xarray combining function."
    268     )

File /opt/homebrew/lib/python3.11/site-packages/virtualizarr/manifests/array.py:366, in _possibly_expand_trailing_ellipsis(indexer, ndim)
    364 if final_dim_indexer == ...:
    365     if len(indexer) > ndim:
--> 366         raise ValueError(
    367             f"Invalid indexer for array. Indexer length must be less than or equal to the number of dimensions in the array, "
    368             f"but indexer={indexer} has length {len(indexer)} and array has {ndim} dimensions."
    369             f"\nIf concatenating using xarray, ensure all non-coordinate data variables to be concatenated include the concatenation dimension, "
    370             f"or consider passing `data_vars='minimal'` and `coords='minimal'` to the xarray combining function."
    371         )
    373     extra_slices_needed = ndim - (len(indexer) - 1)
    374     *indexer_as_list, ellipsis = indexer

ValueError: Invalid indexer for array. Indexer length must be less than or equal to the number of dimensions in the array, but indexer=(0, 0, slice(None, None, None), slice(None, None, None), Ellipsis) has length 5 and array has 4 dimensions.
If concatenating using xarray, ensure all non-coordinate data variables to be concatenated include the concatenation dimension, or consider passing `data_vars='minimal'` and `coords='minimal'` to the xarray combining function.

Doing a regular load like this works fine (with the caveat that it's reading from opendap URLs):

ds = xr.open_mfdataset(
    files_od,
    decode_timedelta=True,
    engine="netcdf4"
) 
one = ds.temp.isel(Time=0, st_ocean=0)
one.plot.imshow()

VirtualiZarr version is virtualizarr-1.3.3.dev81+ga5d04d7 (latest dev as of 1 July 2025)

Metadata

Metadata

Assignees

No one assigned

    Labels

    documentationImprovements or additions to documentation

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions