Replies: 1 comment · 1 reply
-
See https://jobqueue.dask.org/en/latest/generated/dask_jobqueue.SLURMCluster.html. You'll need something like `cluster = dask_jobqueue.SLURMCluster(...)`, then `cluster.scale(4)` and `client = distributed.Client(cluster)`. For your notebook, you shouldn't be requesting all the resources you need for the job; that will go through the queue separately.
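Spelled out a little more, a minimal sketch of that pattern (the `cores`/`memory`/`walltime` values below are placeholders, not taken from this thread; set them to whatever your SLURM site allows):

```python
import dask_jobqueue
import distributed

# Placeholder resource requests -- adjust to your cluster's partitions and limits.
cluster = dask_jobqueue.SLURMCluster(
    cores=8,              # cores per SLURM job
    memory="32GB",        # memory per SLURM job
    walltime="01:00:00",  # wall-clock limit for each job
    # queue="normal",     # partition name, if your site requires one
)

cluster.scale(4)                      # submit 4 such jobs; each becomes a Dask worker
client = distributed.Client(cluster)  # computations in the notebook now use those workers
```

The JupyterLab session itself then only needs enough resources to drive the notebook; the heavy computation runs in the worker jobs that `cluster.scale()` submits to the SLURM queue.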
-
I have complex data (Fourier coefficients) with the following dimensions: `(noise, time, kx, ky) = (5, 200, 4096, 4096)`. I want to calculate the lagged time correlation between different noises. I have one file per noise that I am loading with `open_mfdataset`, giving `chunks="auto"`, since I can't load everything (250 GB) into memory.

AFAIK, `xr.corr` only works for real variables and does not perform lagged correlation. I could also try to somehow wrap `scipy.signal.correlate`. My first try was to run a simple example for lag 0, which takes about 17 minutes to run.

What I want is a correlation matrix like `(lag, kx, ky)` for each combination of noises I give. I was thinking of using dask to parallelize and speed up this process, but I am already requesting resources from the cluster for the JupyterLab session I am running this in. So I am not sure whether the client I create for dask can somehow interact with SLURM for this, or whether there are better alternatives with the built-in functions we have in `xarray`.
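For the correlation itself, here is a rough sketch of one way to get a `(lag, kx, ky)` result for a pair of noises, assuming the complex variable is called `fk` (that name, the lag convention, and the lack of normalization are all assumptions, not from the thread). Shifting along `time` and averaging `a * conj(b)` stays lazy on dask-backed arrays, so it can run on the SLURMCluster workers:

```python
import numpy as np
import pandas as pd
import xarray as xr

def lagged_corr(a, b, lags, dim="time"):
    """Lagged cross-correlation <a(t) * conj(b(t + lag))> averaged over `dim`.

    Illustrative only: no normalization is applied, and edge samples lost to
    the shift are skipped via NaNs.
    """
    out = []
    for lag in lags:
        shifted = b.shift({dim: -lag})   # b(t + lag); edges are padded with NaN
        prod = a * np.conj(shifted)      # elementwise product, lazy if dask-backed
        out.append(prod.mean(dim, skipna=True))
    return xr.concat(out, dim=pd.Index(list(lags), name="lag"))

# Hypothetical usage with files opened via open_mfdataset(..., chunks="auto"):
# ds = xr.open_mfdataset("noise_*.nc", combine="nested", concat_dim="noise", chunks="auto")
# corr = lagged_corr(ds["fk"].isel(noise=0), ds["fk"].isel(noise=1), lags=range(0, 20))
# corr = corr.compute()   # executes on the Dask cluster if a Client is active
```

Whether this beats wrapping `scipy.signal.correlate` depends on how many lags you need, since the loop recomputes the product for every lag; it is meant only to show the shape of the computation.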