Fast way to check if a variable is in a list of netCDF files? Or load only the files that have a list of variables. #7718
-
Let's say I have a long list of files

    import xarray as xr
    from tqdm import tqdm
    fnames = [fname for fname in tqdm(all_fnames) if param in list(xr.open_dataset(fname))]

This is returning

Is there a faster way to do that? Any suggestions? My idea is actually to use
Replies: 2 comments 3 replies
-
Using

    from netCDF4 import Dataset
    fnames = [fname for fname in tqdm(all_fnames) if param in Dataset(fname).variables]

which returns
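The title also asks about keeping only the files that contain a whole list of variables. Here is a minimal sketch of how the netCDF4 check could be extended to cover that, assuming `Dataset` is used as a context manager so each file is closed right after the lookup. The helper names `has_all_variables`, `list_variables_netcdf4` and `files_with_all` are hypothetical and not part of netCDF4 or xarray.

```python
def has_all_variables(available, required):
    """Return True if every name in `required` is present in `available`."""
    return set(required) <= set(available)


def list_variables_netcdf4(fname):
    """Return the root-group variable names of one file, then close it."""
    from netCDF4 import Dataset  # lazy import; assumes netCDF4 is installed
    with Dataset(fname) as ds:
        return list(ds.variables)


def files_with_all(fnames, required, list_variables=list_variables_netcdf4):
    """Keep only the files whose variables include every required name.

    `list_variables` is any callable mapping a file name to its variable
    names, so a different backend can be swapped in.
    """
    return [f for f in fnames if has_all_variables(list_variables(f), required)]
```

The filtered list could then be handed to something like `xr.open_mfdataset(fnames)`, so that only the matching files are loaded.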
-
You might check with h5netcdf or even just h5py to retrieve dataset/variable names. But of course only for HDF5-based netCDF flavours.

Update: This should work. Normally it should be in the same range as netCDF4, maybe a bit faster.

    import h5py
    fnames = [fname for fname in tqdm(all_fnames) if param in h5py.File(fname).keys()]

Not sure if and when the open file handles go out of scope and are finally closed.
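On the two open points in this reply (file handles and HDF5-only support), here is a rough sketch rather than a tested recipe. It opens each file in a `with` block so the handle is closed as soon as the membership test is done, and it skips files that do not start with the HDF5 signature, since classic netCDF files start with `CDF` instead. The names `looks_like_hdf5`, `open_h5py`, `has_variable` and `files_with_variable` are made up for illustration. The signature check assumes the file has no HDF5 user block in front of it.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # first 8 bytes of an HDF5-based file


def looks_like_hdf5(fname):
    """Cheap format check: does the file start with the HDF5 signature?

    Assumes there is no HDF5 user block before the signature.
    """
    with open(fname, "rb") as f:
        return f.read(8) == HDF5_SIGNATURE


def open_h5py(fname):
    """Open a file read-only with h5py (lazy import; assumes h5py is installed)."""
    import h5py
    return h5py.File(fname, "r")


def has_variable(fname, param, opener=open_h5py):
    """Open one file, test membership, and close the handle on exit."""
    with opener(fname) as f:
        return param in f


def files_with_variable(fnames, param, opener=open_h5py):
    """Keep the HDF5-based files that contain `param` at the root level."""
    return [
        fname for fname in fnames
        if looks_like_hdf5(fname) and has_variable(fname, param, opener)
    ]
```

Wrapping the list in `tqdm(all_fnames)` as in the snippets above still works, since `files_with_variable` only iterates over whatever it is given.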