Fast way to check if a variable is in a list of netCDF files? Or load only the files that has a list of variables. #7718

Answered by kmuehlbauer
iuryt asked this question in Q&A

Using the netCDF4 package is faster, but I would still like to know whether there is a better way to do this with xarray.

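from netCDF4 import Dataset
from tqdm import tqdm
# all_fnames (list of file paths) and param (variable name) are defined earlier
fnames = [fname for fname in tqdm(all_fnames) if param in Dataset(fname).variables]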

which returns

100%|██████████| 1295/1295 [00:12<00:00, 101.96it/s]

You might try h5netcdf, or even just h5py, to retrieve the dataset/variable names. Of course, this only works for HDF5-based netCDF flavours.
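If the file list might mix netCDF flavours, a cheap pre-check can tell which files h5py is able to open at all. The sketch below is only an illustration: `is_hdf5_based` is a hypothetical helper name, and it assumes the HDF5 signature sits at byte 0, which is the usual layout for netCDF-4 files but not guaranteed for HDF5 files that carry a user block. `h5py.is_hdf5(fname)` covers the general case.

```python
# The 8-byte signature that starts an HDF5 file (and therefore a netCDF-4 file).
# Classic netCDF files start with b"CDF" followed by a version byte instead.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def is_hdf5_based(path):
    """Return True if the file at `path` begins with the HDF5 signature.

    Hypothetical helper: it only inspects offset 0, so an HDF5 file with a
    user block in front of the superblock would be reported as False.
    """
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

Files for which this returns False would still need netCDF4 (or xarray) to list their variables.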

Update:

This should work. Its speed should normally be in the same range as netCDF4, maybe a bit faster.

import h5py
fnames = [fname for fname in tqdm(all_fnames) if param in h5py.File(fname).keys()]

Not sure if and when the open file handles go out of scope and are closed fin…
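One way to sidestep the open-handle question is to open each file in a `with` block, so the handle is closed as soon as the names have been read. The following is a minimal sketch rather than a tested recipe for this dataset: `h5py_names` and `files_with_variable` are hypothetical helper names, and the `list_names` parameter is an assumed hook so the same loop could be reused with another reader (for example one built on `netCDF4.Dataset(...).variables`).

```python
def h5py_names(path):
    """Return the names in the root group of an HDF5-based netCDF file."""
    import h5py  # imported lazily so the helper below works with other readers

    # The context manager closes the file handle when the block exits.
    with h5py.File(path, "r") as f:
        return set(f.keys())


def files_with_variable(fnames, name, list_names=h5py_names):
    """Keep only the paths whose listed names include `name`."""
    return [fname for fname in fnames if name in list_names(fname)]
```

With the names from the question, this would be called as `fnames = files_with_variable(tqdm(all_fnames), param)`. Note that `f.keys()` only lists the root group, so variables stored in nested groups would not be found this way.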

Answer selected by iuryt