What exactly does open_mfdataset hold in memory? Whatever it is won't fit #6452

openSourcerer9000 · 2022-04-07T14:49:49Z


So I'm trying to open 800 of these GRIB2 files, each of which is 700KB compressed and 100MB uncompressed, using

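import xarray as xr

def timeKeepsOnSlipping(ds):
    '''...into the future'''
    # Add 'time' as a length-1 dimension to each file's dataset
    return ds.expand_dims(dim='time')

# gribpths is the list of paths to the GRIB2 files
grb = xr.open_mfdataset(
    gribpths,
    engine='cfgrib',
    preprocess=timeKeepsOnSlipping,
    chunks={'latitude': 10, 'longitude': 10},
)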

After at least several hours (I'm not sure exactly, since I ran it overnight), it just raises a MemoryError. I was under the impression that xarray just lazily created pointers to the data in the files on disk. Is it really trying to read 80GB into memory, exceeding my 32GB of RAM? Reading single files has always been instant and appeared lazy to me; this is my first experience with open_mfdataset.
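For reference, the 80GB figure is just the file count times the uncompressed size. A quick back-of-the-envelope check, assuming every file really expands to about 100MB and using decimal units (1GB = 1000MB):

```python
# Rough estimate only: assumes all 800 files are ~100 MB once uncompressed.
n_files = 800
mb_per_file = 100
ram_gb = 32

total_gb = n_files * mb_per_file / 1000  # decimal GB
ratio = total_gb / ram_gb                # how many times larger than RAM

print(f"uncompressed total: {total_gb:.0f} GB, {ratio:.1f}x available RAM")
```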

How can I process this data then (regionmask, interpolate_na, then serialize to .nc)? Also, if it will take forever to open, how can I at least see its progress? I was trying to sneak a progress bar into the preprocess function, but all the print statements stay hidden until it finishes and displays the dataset (I'm using the Jupyter extension in VS Code, FYI).
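The kind of progress reporting I have in mind looks roughly like the sketch below. `with_progress` is a hypothetical helper of my own, not part of xarray; it assumes the preprocess callable is invoked once per file, and it writes to stderr and flushes after each call instead of relying on print. I can't promise the notebook will actually show the output live.

```python
import sys

def with_progress(func, total, stream=sys.stderr):
    """Wrap func so that every call reports 'done/total' on stream.

    Hypothetical helper, not an xarray API.
    """
    state = {"done": 0}

    def wrapper(ds):
        result = func(ds)
        state["done"] += 1
        # Write and flush immediately so the message is not held in a buffer
        stream.write(f"\rpreprocessed {state['done']}/{total}")
        stream.flush()
        return result

    return wrapper
```

It would be passed as `preprocess=with_progress(timeKeepsOnSlipping, total=len(gribpths))` in the open_mfdataset call above.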

xarray version 0.20.2

Thanks!

openSourcerer9000 · 2022-04-08T14:58:03Z


So I've since discovered Dask Distributed and life is much better: I'm able to schedule things efficiently and keep the job in memory. I'm still finding that it takes 20 seconds just to open each little GRIB file, and I have no idea why: