With the single-machine scheduler, dask can optionally cache intermediate results: https://docs.dask.org/en/stable/caching.html. That cache lives in memory, so it won't persist across sessions. You could also try https://github.com/ncar-xdev/xpersist
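The "compute it on a miss and also store it" behaviour you describe is usually called a read-through cache. Here is a minimal sketch of that pattern in plain Python. It is only an illustration, not dask's or xpersist's API: `ReadThroughCache`, `expensive_chunk`, and the chunk ids are hypothetical names, and the `dict` stands in for whatever persistent key-value store you would actually use.

```python
import pickle
from collections.abc import MutableMapping
from typing import Any, Callable, Hashable


class ReadThroughCache:
    """Compute a chunk on first request, then serve it from `store`."""

    def __init__(self, store: MutableMapping, compute_chunk: Callable[[Hashable], Any]):
        self.store = store                  # any mapping of str -> bytes
        self.compute_chunk = compute_chunk  # called only on a cache miss
        self.misses = 0

    def get(self, chunk_id: Hashable) -> Any:
        key = repr(chunk_id)
        if key in self.store:
            # Hit: the chunk was computed earlier, possibly in another session.
            return pickle.loads(self.store[key])
        # Miss: compute the chunk, write it to the store, then return it.
        self.misses += 1
        value = self.compute_chunk(chunk_id)
        self.store[key] = pickle.dumps(value)
        return value


def expensive_chunk(chunk_id):
    # Hypothetical stand-in for the real per-chunk computation.
    row, col = chunk_id
    return [row * 10 + col + i for i in range(4)]


store = {}  # assumption: replace with a persistent mapping in practice
cache = ReadThroughCache(store, expensive_chunk)
first = cache.get((0, 1))   # computed and written to the store
second = cache.get((0, 1))  # read back from the store
```

Only the chunks that a downstream filter actually asks for get computed and written, which is the "in between" behaviour: nothing is computed eagerly, but nothing is computed twice either.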
I tried searching for this, but I'm new to the community, and I'm not even sure what term we would use for this behaviour. If I've missed something obvious please let me know!
Say I have a computation specified on an xarray dataset backed by a zarr store.
At this point, nothing is computed, right? And as I understand it, the usual pattern for using this `rs` object is to either compute the whole object and persist it, or to use it as the input for future computations that may eventually apply a filter, and that filter can get propagated up to ensure that only the appropriate chunks get opened from the S3 store.

Can I do something in between? Can I request that `rs` be "in principle" backed by a different remote zarr store, so that when some child computation of `rs` applies a filter and forces a computation, and that second store doesn't contain the necessary data, the data gets computed in this fashion and also stored there?