The `xr.Dataset` constructed by `open_virtual_dataset` doesn't seem to correctly identify coordinates when the coordinate has more than one dimension. The bug seems to be in `separate_coords` on this line. The correct functionality could be to use the `coordinates` attribute within each variable's `.zattrs` and maintain a set of all coordinate names.
Here is a reproducible example:
>>> import xarray as xr
>>> xr.tutorial.open_dataset("ROMS_example.nc")
<xarray.Dataset> Size: 19MB
Dimensions: (ocean_time: 2, s_rho: 30, eta_rho: 191, xi_rho: 371)
Coordinates:
Cs_r (s_rho) float64 240B ...
lon_rho (eta_rho, xi_rho) float64 567kB ...
hc float64 8B ...
h (eta_rho, xi_rho) float64 567kB ...
lat_rho (eta_rho, xi_rho) float64 567kB ...
Vtransform int32 4B ...
* ocean_time (ocean_time) datetime64[ns] 16B 2001-08-01 2001-08-08
* s_rho (s_rho) float64 240B -0.9833 -0.95 -0.9167 ... -0.05 -0.01667
Dimensions without coordinates: eta_rho, xi_rho
Data variables:
salt (ocean_time, s_rho, eta_rho, xi_rho) float32 17MB ...
zeta (ocean_time, eta_rho, xi_rho) float32 567kB ...
Attributes: (12/34)
file: ../output_20yr_obc/2001/ocean_his_0015.nc
format: netCDF-4/HDF5 file
Conventions: CF-1.4
type: ROMS/TOMS history file
title: TXLA ROMS hindcast run with dyes and oxygen
rst_file: ../output_20yr_obc/2001/ocean_rst.nc
... ...
compiler_flags: -heap-arrays -fp-model fast -mt_mpi -ip -O3 -msse2 -free
tiling: 010x012
history: Tue Jul 24 11:04:43 2018: /opt/nco/ncks -D 4 -t 8 /cop...
ana_file: /home/d.kobashi/TXLA_ROMS_reana/Functionals/ana_btflux...
CPP_options: TXLA2, ANA_BPFLUX, ANA_BSFLUX, ANA_BTFLUX, ANA_NUDGCOE...
NCO: netCDF Operators version 4.7.6-alpha04 (Homepage = htt...
% wget https://github.com/pydata/xarray-data/raw/master/ROMS_example.nc
>>> from virtualizarr import open_virtual_dataset
>>> vds = open_virtual_dataset('ROMS_example.nc', indexes={})
>>> vds
<xarray.Dataset> Size: 19MB
Dimensions: (ocean_time: 2, eta_rho: 191, xi_rho: 371, s_rho: 30)
Coordinates:
s_rho (s_rho) float64 240B ManifestArray<shape=(30,), dtype=float64...
ocean_time (ocean_time) float64 16B ManifestArray<shape=(2,), dtype=floa...
Dimensions without coordinates: eta_rho, xi_rho
Data variables:
zeta (ocean_time, eta_rho, xi_rho) float32 567kB ManifestArray<sha...
lon_rho (eta_rho, xi_rho) float64 567kB ManifestArray<shape=(191, 371...
Vtransform int32 4B ManifestArray<shape=(), dtype=int32, chunks=()>
Cs_r (s_rho) float64 240B ManifestArray<shape=(30,), dtype=float64...
hc float64 8B ManifestArray<shape=(), dtype=float64, chunks=()>
lat_rho (eta_rho, xi_rho) float64 567kB ManifestArray<shape=(191, 371...
h (eta_rho, xi_rho) float64 567kB ManifestArray<shape=(191, 371...
salt (ocean_time, s_rho, eta_rho, xi_rho) float32 17MB ManifestArr...
Attributes: (12/34)
CPP_options: TXLA2, ANA_BPFLUX, ANA_BSFLUX, ANA_BTFLUX, ANA_NUDGCOE...
Conventions: CF-1.4
NCO: netCDF Operators version 4.7.6-alpha04 (Homepage = htt...
NLM_LBC: \nEDGE: WEST SOUTH EAST NORTH \nzeta: Che ...
ana_file: /home/d.kobashi/TXLA_ROMS_reana/Functionals/ana_btflux...
avg_base: ../output_20yr_obc/2001/ocean_avg
... ...
sta_file: ocean_sta.nc
svn_rev:
svn_url: https:://myroms.org/svn/src
tiling: 010x012
title: TXLA ROMS hindcast run with dyes and oxygen
type: ROMS/TOMS history file
Note that the underlying kerchunk references do carry this coordinate information. Writing the virtual dataset out as kerchunk references and opening them with xarray gives the correct coordinates:
>>> refs = vds.virtualize.to_kerchunk(filepath=None, format="dict")
>>> xr.open_dataset("reference://", engine="zarr", chunks={}, backend_kwargs={"storage_options": {"fo": refs, "consolidated": False}})
<xarray.Dataset> Size: 19MB
Dimensions: (s_rho: 30, eta_rho: 191, xi_rho: 371, ocean_time: 2)
Coordinates:
Cs_r (s_rho) float64 240B dask.array<chunksize=(30,), meta=np.ndarray>
Vtransform float64 8B ...
h (eta_rho, xi_rho) float64 567kB dask.array<chunksize=(191, 371), meta=np.ndarray>
hc float64 8B ...
lat_rho (eta_rho, xi_rho) float64 567kB dask.array<chunksize=(191, 371), meta=np.ndarray>
lon_rho (eta_rho, xi_rho) float64 567kB dask.array<chunksize=(191, 371), meta=np.ndarray>
* ocean_time (ocean_time) datetime64[ns] 16B 2001-08-01 2001-08-08
* s_rho (s_rho) float64 240B -0.9833 -0.95 -0.9167 ... -0.05 -0.01667
Dimensions without coordinates: eta_rho, xi_rho
Data variables:
salt (ocean_time, s_rho, eta_rho, xi_rho) float32 17MB dask.array<chunksize=(1, 15, 96, 186), meta=np.ndarray>
zeta (ocean_time, eta_rho, xi_rho) float32 567kB dask.array<chunksize=(1, 191, 371), meta=np.ndarray>
Attributes: (12/34)
CPP_options: TXLA2, ANA_BPFLUX, ANA_BSFLUX, ANA_BTFLUX, ANA_NUDGCOE...
Conventions: CF-1.4
NCO: netCDF Operators version 4.7.6-alpha04 (Homepage = htt...
NLM_LBC: \nEDGE: WEST SOUTH EAST NORTH \nzeta: Che ...
ana_file: /home/d.kobashi/TXLA_ROMS_reana/Functionals/ana_btflux...
avg_base: ../output_20yr_obc/2001/ocean_avg
... ...
sta_file: ocean_sta.nc
svn_rev:
svn_url: https:://myroms.org/svn/src
tiling: 010x012
title: TXLA ROMS hindcast run with dyes and oxygen
type: ROMS/TOMS history file
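The fix suggested in the description (reading the `coordinates` attribute from each variable's `.zattrs` and keeping a set of all coordinate names) might look roughly like the sketch below. This is not the existing `separate_coords` code. The function name `find_coord_names` and its input shape (a dict mapping variable name to its decoded `.zattrs`) are assumptions made for illustration, and the sample attributes are hand-written rather than copied from `ROMS_example.nc`. It also assumes dimension names are stored under `_ARRAY_DIMENSIONS`, as xarray does for Zarr stores.

```python
def find_coord_names(zattrs_by_var):
    """Collect coordinate names from per-variable attributes.

    zattrs_by_var: dict mapping variable name -> decoded .zattrs dict
    (hypothetical input shape, chosen for this sketch).
    """
    coord_names = set()
    for name, attrs in zattrs_by_var.items():
        dims = attrs.get("_ARRAY_DIMENSIONS", [])
        # A 1D variable named after its own dimension is a dimension coordinate.
        if len(dims) == 1 and dims[0] == name:
            coord_names.add(name)
        # CF-style "coordinates" attribute: a space-separated list of names.
        # These may have any number of dimensions (including zero).
        coord_names.update(attrs.get("coordinates", "").split())
    # Keep only names that actually exist as variables.
    return coord_names & set(zattrs_by_var)


# Hand-written attributes, loosely modelled on the dataset above.
zattrs = {
    "ocean_time": {"_ARRAY_DIMENSIONS": ["ocean_time"]},
    "lon_rho": {"_ARRAY_DIMENSIONS": ["eta_rho", "xi_rho"]},
    "lat_rho": {"_ARRAY_DIMENSIONS": ["eta_rho", "xi_rho"]},
    "zeta": {
        "_ARRAY_DIMENSIONS": ["ocean_time", "eta_rho", "xi_rho"],
        "coordinates": "lon_rho lat_rho ocean_time",
    },
}

coords = find_coord_names(zattrs)
print(sorted(coords))
```

With a set like this, any variable whose name is in the set would go to the coordinates and everything else to the data variables, so 2D variables such as `lon_rho` and `lat_rho` would no longer be misclassified.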