Description
What happened?
While trying to load a .h5 file using load_dataset, I accidentally specified the wrong path. Instead of getting a "no such file or directory" error, however, I got a "did not find a match in any of xarray's currently installed IO backends ['netcdf4']" error. It took some time to figure out that the problem was actually the path, not my installed software libraries.
What did you expect to happen?
I would expect load_dataset to report a "no such file or directory" error, not something referring to the IO backends, when I attempt to open a file that clearly does not exist. For .nc files this already works as expected, see below.
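One way the expected behavior could look, sketched as a hypothetical wrapper around load_dataset (the wrapper name and the early existence check are my assumptions, not xarray code):

```python
import os

def load_dataset_with_path_check(path, **kwargs):
    # Hypothetical wrapper (not part of xarray): fail fast with
    # FileNotFoundError for missing local paths, before xarray tries
    # to guess an IO backend from the file extension.
    if isinstance(path, (str, os.PathLike)) and not os.path.exists(path):
        raise FileNotFoundError(f"[Errno 2] No such file or directory: {path!r}")
    import xarray  # deferred so the path check runs first
    return xarray.load_dataset(path, **kwargs)
```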
Minimal Complete Verifiable Example
import xarray
xarray.load_dataset('not-existing-file.h5')
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[12], line 1
----> 1 xarray.load_dataset('not-existing-file.h5')
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/api.py:279, in load_dataset(filename_or_obj, **kwargs)
276 if "cache" in kwargs:
277 raise TypeError("cache has no effect in this context")
--> 279 with open_dataset(filename_or_obj, **kwargs) as ds:
280 return ds.load()
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/api.py:524, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, backend_kwargs, **kwargs)
521 kwargs.update(backend_kwargs)
523 if engine is None:
--> 524 engine = plugins.guess_engine(filename_or_obj)
526 backend = plugins.get_backend(engine)
528 decoders = _resolve_decoders_kwargs(
529 decode_cf,
530 open_backend_dataset_parameters=backend.open_dataset_parameters,
(...)
536 decode_coords=decode_coords,
537 )
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/plugins.py:177, in guess_engine(store_spec)
169 else:
170 error_msg = (
171 "found the following matches with the input file in xarray's IO "
172 f"backends: {compatible_engines}. But their dependencies may not be installed, see:\n"
173 "https://docs.xarray.dev/en/stable/user-guide/io.html \n"
174 "https://docs.xarray.dev/en/stable/getting-started-guide/installing.html"
175 )
--> 177 raise ValueError(error_msg)
ValueError: did not find a match in any of xarray's currently installed IO backends ['netcdf4']. Consider explicitly selecting one of the installed engines via the ``engine`` parameter, or installing additional IO dependencies, see:
https://docs.xarray.dev/en/stable/getting-started-guide/installing.html
https://docs.xarray.dev/en/stable/user-guide/io.html
Anything else we need to know?
It should be noted that the .h5 file is a working netCDF file that can be loaded and used without installing further libraries if the path is specified correctly. Interestingly, when attempting to load a non-existent .nc file, load_dataset correctly reports "FileNotFoundError: [Errno 2] No such file or directory: b'/home/jovyan/my_materials/not-existing-file.nc'".
Example code:
import xarray
xarray.load_dataset('not-existing-file.nc')
Error message:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/file_manager.py:209, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
208 try:
--> 209 file = self._cache[self._key]
210 except KeyError:
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/lru_cache.py:55, in LRUCache.__getitem__(self, key)
54 with self._lock:
---> 55 value = self._cache[key]
56 self._cache.move_to_end(key)
KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('/home/jovyan/my_materials/not-existing-file.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False)), 'ae3bbd85-042b-46e1-97ae-f8d523bb578a']
During handling of the above exception, another exception occurred:
FileNotFoundError Traceback (most recent call last)
Cell In[11], line 1
----> 1 xarray.load_dataset('not-existing-file.nc')
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/api.py:279, in load_dataset(filename_or_obj, **kwargs)
276 if "cache" in kwargs:
277 raise TypeError("cache has no effect in this context")
--> 279 with open_dataset(filename_or_obj, **kwargs) as ds:
280 return ds.load()
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/api.py:540, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, backend_kwargs, **kwargs)
528 decoders = _resolve_decoders_kwargs(
529 decode_cf,
530 open_backend_dataset_parameters=backend.open_dataset_parameters,
(...)
536 decode_coords=decode_coords,
537 )
539 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 540 backend_ds = backend.open_dataset(
541 filename_or_obj,
542 drop_variables=drop_variables,
543 **decoders,
544 **kwargs,
545 )
546 ds = _dataset_from_backend_dataset(
547 backend_ds,
548 filename_or_obj,
(...)
556 **kwargs,
557 )
558 return ds
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:572, in NetCDF4BackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, format, clobber, diskless, persist, lock, autoclose)
551 def open_dataset(
552 self,
553 filename_or_obj,
(...)
568 autoclose=False,
569 ):
571 filename_or_obj = _normalize_path(filename_or_obj)
--> 572 store = NetCDF4DataStore.open(
573 filename_or_obj,
574 mode=mode,
575 format=format,
576 group=group,
577 clobber=clobber,
578 diskless=diskless,
579 persist=persist,
580 lock=lock,
581 autoclose=autoclose,
582 )
584 store_entrypoint = StoreBackendEntrypoint()
585 with close_on_error(store):
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:376, in NetCDF4DataStore.open(cls, filename, mode, format, group, clobber, diskless, persist, lock, lock_maker, autoclose)
370 kwargs = dict(
371 clobber=clobber, diskless=diskless, persist=persist, format=format
372 )
373 manager = CachingFileManager(
374 netCDF4.Dataset, filename, mode=mode, kwargs=kwargs
375 )
--> 376 return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:323, in NetCDF4DataStore.__init__(self, manager, group, mode, lock, autoclose)
321 self._group = group
322 self._mode = mode
--> 323 self.format = self.ds.data_model
324 self._filename = self.ds.filepath()
325 self.is_remote = is_remote_uri(self._filename)
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:385, in NetCDF4DataStore.ds(self)
383 @property
384 def ds(self):
--> 385 return self._acquire()
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:379, in NetCDF4DataStore._acquire(self, needs_lock)
378 def _acquire(self, needs_lock=True):
--> 379 with self._manager.acquire_context(needs_lock) as root:
380 ds = _nc4_require_group(root, self._group, self._mode)
381 return ds
File ~/my-pykernel/lib/python3.11/contextlib.py:137, in _GeneratorContextManager.__enter__(self)
135 del self.args, self.kwds, self.func
136 try:
--> 137 return next(self.gen)
138 except StopIteration:
139 raise RuntimeError("generator didn't yield") from None
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/file_manager.py:197, in CachingFileManager.acquire_context(self, needs_lock)
194 @contextlib.contextmanager
195 def acquire_context(self, needs_lock=True):
196 """Context manager for acquiring a file."""
--> 197 file, cached = self._acquire_with_cache_info(needs_lock)
198 try:
199 yield file
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/file_manager.py:215, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
213 kwargs = kwargs.copy()
214 kwargs["mode"] = self._mode
--> 215 file = self._opener(*self._args, **kwargs)
216 if self._mode == "w":
217 # ensure file doesn't get overridden when opened again
218 self._mode = "a"
File src/netCDF4/_netCDF4.pyx:2463, in netCDF4._netCDF4.Dataset.__init__()
File src/netCDF4/_netCDF4.pyx:2026, in netCDF4._netCDF4._ensure_nc_success()
FileNotFoundError: [Errno 2] No such file or directory: b'/home/jovyan/my_materials/not-existing-file.nc'
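A plausible explanation, inferred from the two tracebacks above (I have not verified xarray's internals): the installed netcdf4 backend's extension-based guessing recognizes .nc but not .h5, so for a missing .h5 path guess_engine raises its ValueError before any open() call could surface the missing-file error. A pure-Python sketch of that ordering, with an assumed extension list:

```python
import os

# Assumed extension list, for illustration only; the real backend's
# recognized extensions may differ.
KNOWN_EXTENSIONS = {".nc", ".nc4", ".cdf"}

def guess_engine_sketch(path):
    # Mimic the order of operations seen in the tracebacks: extension-based
    # guessing happens before the file is ever opened, so a missing .h5 file
    # yields a backend ValueError rather than FileNotFoundError.
    _, ext = os.path.splitext(path)
    if ext in KNOWN_EXTENSIONS:
        return "netcdf4"  # would only hit FileNotFoundError later, on open()
    raise ValueError(
        "did not find a match in any of xarray's currently installed IO backends"
    )
```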
Environment
INSTALLED VERSIONS
commit: None
python: 3.11.0 | packaged by conda-forge | (main, Oct 25 2022, 06:24:40) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-136-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1
xarray: 2022.12.0
pandas: 1.5.2
numpy: 1.24.1
scipy: None
netCDF4: 1.6.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.6.3
pip: 22.3.1
conda: None
pytest: None
mypy: None
IPython: 8.8.0
sphinx: None