Misleading error message when attempting to load a non-existing file #7435

Open
@kathoef

Description

What happened?

While trying to load a .h5 file using load_dataset, I accidentally specified the wrong path. Instead of getting a "no such file or directory" error, however, I got a "did not find a match in any of xarray's currently installed IO backends ['netcdf4']" error. It took some time to find out that the problem was actually with the path, and not with my installed software libraries.

What did you expect to happen?

I would expect load_dataset to raise a "no such file or directory" error, rather than something referring to the IO backends, when I attempt to open a file that clearly does not exist. For .nc files this seems to work, see below.

Minimal Complete Verifiable Example

import xarray
xarray.load_dataset('not-existing-file.h5')
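Until this is changed in xarray itself, one way to avoid the confusing backend message is to fail fast on missing local paths before engine guessing runs. The wrapper below is a hypothetical sketch (`load_dataset_checked` is not part of xarray's API):

```python
import os

def load_dataset_checked(path, **kwargs):
    """Hypothetical wrapper (not part of xarray): raise FileNotFoundError
    for a missing local path before engine guessing can obscure the cause."""
    if isinstance(path, (str, os.PathLike)) and not os.path.exists(path):
        raise FileNotFoundError(f"No such file or directory: {os.fspath(path)!r}")
    import xarray
    return xarray.load_dataset(path, **kwargs)
```

This only guards plain local paths; file-like objects and remote URIs pass through unchanged.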

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[12], line 1
----> 1 xarray.load_dataset('not-existing-file.h5')

File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/api.py:279, in load_dataset(filename_or_obj, **kwargs)
    276 if "cache" in kwargs:
    277     raise TypeError("cache has no effect in this context")
--> 279 with open_dataset(filename_or_obj, **kwargs) as ds:
    280     return ds.load()

File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/api.py:524, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, backend_kwargs, **kwargs)
    521     kwargs.update(backend_kwargs)
    523 if engine is None:
--> 524     engine = plugins.guess_engine(filename_or_obj)
    526 backend = plugins.get_backend(engine)
    528 decoders = _resolve_decoders_kwargs(
    529     decode_cf,
    530     open_backend_dataset_parameters=backend.open_dataset_parameters,
   (...)
    536     decode_coords=decode_coords,
    537 )

File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/plugins.py:177, in guess_engine(store_spec)
    169 else:
    170     error_msg = (
    171         "found the following matches with the input file in xarray's IO "
    172         f"backends: {compatible_engines}. But their dependencies may not be installed, see:\n"
    173         "https://docs.xarray.dev/en/stable/user-guide/io.html \n"
    174         "https://docs.xarray.dev/en/stable/getting-started-guide/installing.html"
    175     )
--> 177 raise ValueError(error_msg)

ValueError: did not find a match in any of xarray's currently installed IO backends ['netcdf4']. Consider explicitly selecting one of the installed engines via the ``engine`` parameter, or installing additional IO dependencies, see:
https://docs.xarray.dev/en/stable/getting-started-guide/installing.html
https://docs.xarray.dev/en/stable/user-guide/io.html

Anything else we need to know?

It should be noted that the .h5 file is a working netCDF file that can be loaded and used without installing further libraries, provided the path is specified correctly. Interestingly, when attempting to load a non-existing .nc file, the load_dataset error message correctly says "FileNotFoundError: [Errno 2] No such file or directory: b'/home/jovyan/my_materials/not-existing-file.nc'".

Example code,

import xarray
xarray.load_dataset('not-existing-file.nc')

Error message,

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/file_manager.py:209, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
    208 try:
--> 209     file = self._cache[self._key]
    210 except KeyError:

File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/lru_cache.py:55, in LRUCache.__getitem__(self, key)
     54 with self._lock:
---> 55     value = self._cache[key]
     56     self._cache.move_to_end(key)

KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('/home/jovyan/my_materials/not-existing-file.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False)), 'ae3bbd85-042b-46e1-97ae-f8d523bb578a']

During handling of the above exception, another exception occurred:

FileNotFoundError                         Traceback (most recent call last)
Cell In[11], line 1
----> 1 xarray.load_dataset('not-existing-file.nc')

File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/api.py:279, in load_dataset(filename_or_obj, **kwargs)
    276 if "cache" in kwargs:
    277     raise TypeError("cache has no effect in this context")
--> 279 with open_dataset(filename_or_obj, **kwargs) as ds:
    280     return ds.load()

File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/api.py:540, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, backend_kwargs, **kwargs)
    528 decoders = _resolve_decoders_kwargs(
    529     decode_cf,
    530     open_backend_dataset_parameters=backend.open_dataset_parameters,
   (...)
    536     decode_coords=decode_coords,
    537 )
    539 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 540 backend_ds = backend.open_dataset(
    541     filename_or_obj,
    542     drop_variables=drop_variables,
    543     **decoders,
    544     **kwargs,
    545 )
    546 ds = _dataset_from_backend_dataset(
    547     backend_ds,
    548     filename_or_obj,
   (...)
    556     **kwargs,
    557 )
    558 return ds

File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:572, in NetCDF4BackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, format, clobber, diskless, persist, lock, autoclose)
    551 def open_dataset(
    552     self,
    553     filename_or_obj,
   (...)
    568     autoclose=False,
    569 ):
    571     filename_or_obj = _normalize_path(filename_or_obj)
--> 572     store = NetCDF4DataStore.open(
    573         filename_or_obj,
    574         mode=mode,
    575         format=format,
    576         group=group,
    577         clobber=clobber,
    578         diskless=diskless,
    579         persist=persist,
    580         lock=lock,
    581         autoclose=autoclose,
    582     )
    584     store_entrypoint = StoreBackendEntrypoint()
    585     with close_on_error(store):

File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:376, in NetCDF4DataStore.open(cls, filename, mode, format, group, clobber, diskless, persist, lock, lock_maker, autoclose)
    370 kwargs = dict(
    371     clobber=clobber, diskless=diskless, persist=persist, format=format
    372 )
    373 manager = CachingFileManager(
    374     netCDF4.Dataset, filename, mode=mode, kwargs=kwargs
    375 )
--> 376 return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)

File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:323, in NetCDF4DataStore.__init__(self, manager, group, mode, lock, autoclose)
    321 self._group = group
    322 self._mode = mode
--> 323 self.format = self.ds.data_model
    324 self._filename = self.ds.filepath()
    325 self.is_remote = is_remote_uri(self._filename)

File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:385, in NetCDF4DataStore.ds(self)
    383 @property
    384 def ds(self):
--> 385     return self._acquire()

File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:379, in NetCDF4DataStore._acquire(self, needs_lock)
    378 def _acquire(self, needs_lock=True):
--> 379     with self._manager.acquire_context(needs_lock) as root:
    380         ds = _nc4_require_group(root, self._group, self._mode)
    381     return ds

File ~/my-pykernel/lib/python3.11/contextlib.py:137, in _GeneratorContextManager.__enter__(self)
    135 del self.args, self.kwds, self.func
    136 try:
--> 137     return next(self.gen)
    138 except StopIteration:
    139     raise RuntimeError("generator didn't yield") from None

File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/file_manager.py:197, in CachingFileManager.acquire_context(self, needs_lock)
    194 @contextlib.contextmanager
    195 def acquire_context(self, needs_lock=True):
    196     """Context manager for acquiring a file."""
--> 197     file, cached = self._acquire_with_cache_info(needs_lock)
    198     try:
    199         yield file

File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/file_manager.py:215, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
    213     kwargs = kwargs.copy()
    214     kwargs["mode"] = self._mode
--> 215 file = self._opener(*self._args, **kwargs)
    216 if self._mode == "w":
    217     # ensure file doesn't get overridden when opened again
    218     self._mode = "a"

File src/netCDF4/_netCDF4.pyx:2463, in netCDF4._netCDF4.Dataset.__init__()

File src/netCDF4/_netCDF4.pyx:2026, in netCDF4._netCDF4._ensure_nc_success()

FileNotFoundError: [Errno 2] No such file or directory: b'/home/jovyan/my_materials/not-existing-file.nc'
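The two tracebacks suggest where the behaviors diverge: for .nc files the netcdf4 backend actually opens the path and surfaces the OS-level FileNotFoundError, while for an unrecognized extension plugins.guess_engine raises its backend-mismatch ValueError without ever touching the filesystem. A possible fix would be to check path existence inside the engine-guessing step. The sketch below uses assumed names and structure, not xarray's actual code:

```python
import os

def guess_engine_sketch(store_spec, compatible_engines):
    # Sketch only: function name and signature are assumptions.
    # A missing local path should surface as FileNotFoundError rather than
    # the backend-mismatch ValueError shown in the traceback above.
    if isinstance(store_spec, (str, os.PathLike)) and not os.path.exists(store_spec):
        raise FileNotFoundError(
            f"No such file or directory: {os.fspath(store_spec)!r}"
        )
    if not compatible_engines:
        raise ValueError(
            "did not find a match in any of xarray's currently installed IO backends"
        )
    return compatible_engines[0]
```

Remote URIs and in-memory objects would need to bypass the existence check, so a real patch would have to distinguish those cases first.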

Environment

INSTALLED VERSIONS

commit: None
python: 3.11.0 | packaged by conda-forge | (main, Oct 25 2022, 06:24:40) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-136-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1

xarray: 2022.12.0
pandas: 1.5.2
numpy: 1.24.1
scipy: None
netCDF4: 1.6.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.6.3
pip: 22.3.1
conda: None
pytest: None
mypy: None
IPython: 8.8.0
sphinx: None
