Replies: 3 comments 5 replies
-
@mariomech I've reproduced this using xarray 0.16.2. The source of the problem is most likely that your source files have different `scale_factor` and `add_offset` values. Also, your data isn't exported as packed data but as floating point data. I've no idea at which point the wrong decoding takes place, but obviously that happens. It seems to work using `combine="nested"`, though. That sounds like a serious issue.
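One way to check whether the files disagree on how `t` is stored is to compare the `encoding` dict that xarray attaches to the variable after opening each file. Below is a minimal sketch that works on plain dicts, so it runs without the files. `diff_encodings` is a hypothetical helper, and the numbers are invented for illustration, not read from first.nc or second.nc.

```python
def diff_encodings(encodings, keys=("dtype", "scale_factor", "add_offset", "_FillValue")):
    """Return {key: {file: value}} for every encoding key that differs between files."""
    differing = {}
    for key in keys:
        values = {name: enc.get(key) for name, enc in encodings.items()}
        # Compare via repr so numpy dtypes, floats and None can be mixed safely
        if len({repr(v) for v in values.values()}) > 1:
            differing[key] = values
    return differing


# Invented stand-ins for xr.open_dataset(path)["t"].encoding of each file
encodings = {
    "first.nc": {"dtype": "int16", "scale_factor": 0.001, "add_offset": 250.0},
    "second.nc": {"dtype": "int16", "scale_factor": 0.002, "add_offset": 260.0},
}

result = diff_encodings(encodings)
for key, per_file in result.items():
    print(key, per_file)
```

With real files, the dicts would presumably come from `xr.open_dataset(path)["t"].encoding`; any key reported here is one where the combined dataset can only keep a single file's value.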
-
The reason for this is

>>> ds.t.encoding["dtype"]
dtype('int16')
>>> rs.t.encoding["dtype"]
dtype('float32')

I'm not really sure why, though. Maybe someone with more knowledge on how our [...]

You can work around this by deleting the `dtype` entry from the variable's encoding before writing:

In [2]: import xarray as xr
   ...:
   ...: def _remove_z(ds):
   ...:     if 'z' in ds.variables:
   ...:         ds = ds.drop_vars('z')
   ...:     return ds
   ...:
   ...:
   ...: ds = xr.open_mfdataset(
   ...:     ['first.nc', 'second.nc'],
   ...:     decode_times=False,
   ...:     preprocess=_remove_z,
   ...:     combine='by_coords',
   ...: )
   ...: # drop the inherited int16 dtype so 't' is no longer written as packed integers
   ...: del ds.t.encoding["dtype"]
   ...: ds.to_netcdf("result.nc")
   ...:
   ...: with xr.open_dataset('result.nc', decode_times=False) as rs:
   ...:     xr.testing.assert_allclose(ds, rs, atol=1e-7)
   ...:
-
@kmuehlbauer your idea is indeed correct. `scale_factor` and `add_offset` are different for the various datasets, and `xr.merge` or `xr.open_mfdataset` keeps the encoding of only one of them. The workaround suggested by @keewis works great, but it is still only a temporary patch for @mariomech, who noticed this problem. Moreover, the drawback is that the variable will be written as float32 instead of int16. It seems that this issue arises from a feature request [...]
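To make the mismatch concrete, here is a rough numpy-only sketch of CF-style packing (`packed = round((value - add_offset) / scale_factor)`). It shows that an encoding tuned to one file cannot represent another file's values in int16, and how a `scale_factor`/`add_offset` pair recomputed from the combined range could keep int16 output. The temperatures and the helper names `packing_for`, `pack` and `unpack` are made up for illustration; this is not what xarray does internally.

```python
import numpy as np

I16 = np.iinfo(np.int16)


def packing_for(values):
    """Pick scale_factor/add_offset mapping [min, max] onto -32767..32767
    (-32768 stays free, e.g. for a fill value)."""
    vmin, vmax = float(np.min(values)), float(np.max(values))
    scale_factor = (vmax - vmin) / (2 * I16.max) or 1.0  # avoid a zero scale for constant data
    add_offset = (vmax + vmin) / 2
    return scale_factor, add_offset


def pack(values, scale_factor, add_offset):
    """Packed integers kept as float64 (no cast yet), so overflow stays visible."""
    return np.round((np.asarray(values, dtype=np.float64) - add_offset) / scale_factor)


def unpack(packed, scale_factor, add_offset):
    return packed * scale_factor + add_offset


# Invented temperatures (K) standing in for the two files
first = np.array([250.0, 255.0, 260.0])
second = np.array([280.0, 290.0, 300.0])
combined = np.concatenate([first, second])

# Encoding tuned to the first file only: the second file's values
# would need integers far outside the int16 range
sf1, ao1 = packing_for(first)
overflow = np.abs(pack(combined, sf1, ao1)) > I16.max

# Encoding recomputed from the combined range: everything fits and round-trips
sf, ao = packing_for(combined)
packed = pack(combined, sf, ao).astype(np.int16)
max_error = np.max(np.abs(unpack(packed, sf, ao) - combined))

print(overflow)
print(max_error <= sf / 2 + 1e-9)
```

If this carries over to the real data, the recomputed pair could presumably be assigned to `ds.t.encoding["scale_factor"]` and `ds.t.encoding["add_offset"]` (leaving `dtype` as int16) before calling `to_netcdf()`. I have not verified that against the files in this thread, and computing the min/max of a dask-backed variable forces a pass over the data.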
-
Hej,

I am trying to combine a bunch of ERA5 datasets into a single one with `open_mfdataset()` and write the result to a single netCDF file. While the combining seems to work, the method `to_netcdf()` seems to change some of the values. Before writing, the values in the dataset are still the same as in the original file.

This is what I do: [...]

The variable that changes is 't', the temperature. In the single file `/tmp/second.nc` I have [...] But the result after `ds.to_netcdf('/tmp/result.nc')` is [...] So [...] changes to [...] the moment I write to the file.

And for completeness: python==3.8.5, xarray==0.15.0

Any help is very much appreciated.

Thank you.
first.nc.gz
second.nc.gz