Labels: bug (Issues noting problems and PRs fixing those problems.)
Description
Summary
When running XCP-D on a dataset from our 7 T scanner, the make_qc_plots_es node fails with a MemoryError. The job then raises repeated BrokenProcessPool exceptions and stalls until it reaches its time limit.
Additional details
- xcp_d version: 0.9.1
- Apptainer version: 1.4.0-1.el8
This was running on an HPC node with 12 threads and 240 GB of RAM allocated to the job by Slurm.
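A minimal sketch for checking which limits the job actually sees inside the allocation, assuming a Linux node. The helper names `format_limit` and `describe_limits` are hypothetical and not part of XCP-D. It may be relevant because the config logged below reports cpu_count = 128 although the job was given 12 threads, so values seen inside the container do not necessarily match the Slurm request.

```python
import os
import resource


def format_limit(value):
    """Render a resource limit in GB, or 'unlimited' for RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 1e9:.1f} GB"


def describe_limits():
    """Collect limits visible to the current process (hypothetical helper)."""
    # RLIMIT_AS caps the virtual address space; a low value can make large
    # allocations fail even when the node reports plenty of free memory.
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    return {
        "address_space_soft": format_limit(soft),
        "address_space_hard": format_limit(hard),
        # CPUs this process may run on; can be smaller than the node total.
        "usable_cpus": len(os.sched_getaffinity(0)),
        "node_cpus": os.cpu_count(),
    }


if __name__ == "__main__":
    for key, value in describe_limits().items():
        print(f"{key}: {value}")
```

Running this inside the same container and Slurm allocation as the failing job would show whether an address-space limit applies. It does not inspect cgroup memory limits, which Slurm may also enforce.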
What were you trying to do?
Running XCP-D to compute resting-state functional connectivity, ReHo, and ALFF on a 1 mm resolution, MNI-space preprocessed BOLD image from fMRIPrep.
What did you expect to happen?
XCP-D completes as it does for other data from this project.
What actually happened?
The job starts as expected, with the following:
Framewise displacement-based scrubbing is disabled. The following parameters will have no effect:
--min-time
250923-16:39:10,62 nipype.workflow IMPORTANT:
Running XCP-D version 0.9.1
250923-16:39:10,128 nipype.workflow WARNING:
Previous output generated by version 0+unknown found.
250923-16:39:11,279 nipype.workflow IMPORTANT:
Building XCP-D's workflow:
* Preprocessing derivatives path: /data/bids/derivatives/fmriprep.
* Participant list: ['CUPS003'].
* Run identifier: 20250923-163831_123664ac-7b4c-4a1f-8c40-bce334b70aa1.
250923-16:39:12,882 nipype.utils IMPORTANT:
Collected data:
anat_brainmask: /data/bids/derivatives/fmriprep/sub-CUPS003/ses-A/anat/sub-CUPS003_ses-A_acq-mp2rageunidenoised_space-MNI152NLin2009cAsym_res-1_desc-brain_mask.nii.gz
anat_to_template_xfm: /data/bids/derivatives/fmriprep/sub-CUPS003/ses-A/anat/sub-CUPS003_ses-A_acq-mp2rageunidenoised_from-T1w_to-MNI152NLin2009cAsym_mode-image_xfm.h5
bold:
- /data/bids/derivatives/fmriprep/sub-CUPS003/ses-A/func/sub-CUPS003_ses-A_task-rest_dir-PA_run-1_space-MNI152NLin2009cAsym_res-1_desc-preproc_bold.nii.gz
t1w: /data/bids/derivatives/fmriprep/sub-CUPS003/ses-A/anat/sub-CUPS003_ses-A_acq-mp2rageunidenoised_desc-preproc_T1w.nii.gz
t2w: null
template_to_anat_xfm: /data/bids/derivatives/fmriprep/sub-CUPS003/ses-A/anat/sub-CUPS003_ses-A_acq-mp2rageunidenoised_from-MNI152NLin2009cAsym_to-T1w_mode-image_xfm.h5
250923-16:39:13,45 nipype.utils INFO:
No standard-space surfaces found.
250923-16:39:13,498 nipype.utils IMPORTANT:
Collected mesh files:
lh_pial_surf: /data/bids/derivatives/fmriprep/sub-CUPS003/ses-A/anat/sub-CUPS003_ses-A_acq-mp2rageunidenoised_hemi-L_pial.surf.gii
lh_subject_sphere: null
lh_wm_surf: /data/bids/derivatives/fmriprep/sub-CUPS003/ses-A/anat/sub-CUPS003_ses-A_acq-mp2rageunidenoised_hemi-L_smoothwm.surf.gii
rh_pial_surf: /data/bids/derivatives/fmriprep/sub-CUPS003/ses-A/anat/sub-CUPS003_ses-A_acq-mp2rageunidenoised_hemi-R_pial.surf.gii
rh_subject_sphere: null
rh_wm_surf: /data/bids/derivatives/fmriprep/sub-CUPS003/ses-A/anat/sub-CUPS003_ses-A_acq-mp2rageunidenoised_hemi-R_smoothwm.surf.gii
250923-16:39:13,811 nipype.utils IMPORTANT:
Collected morphometry files:
cortical_thickness: null
cortical_thickness_corr: null
myelin: null
myelin_smoothed: null
sulcal_curv: null
sulcal_depth: null
250923-16:39:25,291 nipype.utils IMPORTANT:
Collected run data for sub-CUPS003_ses-A_task-rest_dir-PA_run-1_space-MNI152NLin2009cAsym_res-1_desc-preproc_bold.nii.gz:
boldmask: /data/bids/derivatives/fmriprep/sub-CUPS003/ses-A/func/sub-CUPS003_ses-A_task-rest_dir-PA_run-1_space-MNI152NLin2009cAsym_res-1_desc-brain_mask.nii.gz
boldref: /data/bids/derivatives/fmriprep/sub-CUPS003/ses-A/func/sub-CUPS003_ses-A_task-rest_dir-PA_run-1_space-MNI152NLin2009cAsym_res-1_boldref.nii.gz
confounds: /data/bids/derivatives/fmriprep/sub-CUPS003/ses-A/func/sub-CUPS003_ses-A_task-rest_dir-PA_run-1_desc-confounds_timeseries.tsv
confounds_json: /data/bids/derivatives/fmriprep/sub-CUPS003/ses-A/func/sub-CUPS003_ses-A_task-rest_dir-PA_run-1_desc-confounds_timeseries.json
250923-16:39:33,45 nipype.workflow INFO:
XCP-D workflow graph with 131 nodes built successfully.
250923-16:39:36,918 nipype.workflow INFO:
Generated workflow graph: /sing_scratch/xcp_d_0_9_wf/graph.svg (graph2use=colored, simple_form=True).
250923-16:39:55,989 nipype.workflow VERBOSE:
XCP-D config:
[environment]
cpu_count = 128
exec_env = "posix"
free_mem = 232.8
overcommit_policy = "heuristic"
overcommit_limit = "50%"
nipype_version = "1.8.6"
templateflow_version = "24.2.0"
version = "0.9.1"
[execution]
fmri_dir = "/data/bids/derivatives/fmriprep"
aggr_ses_reports = 4
bids_database_dir = "/sing_scratch/20250923-163831_123664ac-7b4c-4a1f-8c40-bce334b70aa1/bids_db"
bids_description_hash = "f347f1dc629335f5a4e6cd510475f9ae41e960bf25370631a04eca601bbf9638"
boilerplate_only = false
debug = []
xcp_d_dir = "/data/bids/derivatives/xcp_d"
fs_license_file = "/imgdir/license.txt"
layout = "BIDS Layout: ...data/bids/derivatives/fmriprep | Subjects: 1 | Sessions: 1 | Runs: 1"
log_dir = "/data/bids/derivatives/xcp_d/logs"
log_level = 15
low_mem = false
md_only_boilerplate = false
notrack = true
reports_only = false
output_dir = "/data/bids/derivatives/xcp_d"
atlases = [ "4S156Parcels", "4S256Parcels", "4S356Parcels", "4S456Parcels", "4S556Parcels", "4S656Parcels", "Glasser", "Gordon",]
run_uuid = "20250923-163831_123664ac-7b4c-4a1f-8c40-bce334b70aa1"
participant_label = [ "CUPS003",]
templateflow_home = "/imgdir/templateflow"
work_dir = "/sing_scratch"
write_graph = true
[workflow]
mode = "none"
file_format = "nifti"
dummy_scans = 0
input_type = "fmriprep"
despike = false
params = "aroma"
smoothing = 5.0
output_interpolated = true
output_correlations = true
combine_runs = false
motion_filter_order = 4
head_radius = 50
fd_thresh = 0.0
min_time = 0
bandpass_filter = true
high_pass = 0.01
low_pass = 0.08
bpf_order = 2
min_coverage = 0.5
dcan_correlation_lengths = []
process_surfaces = false
abcc_qc = true
linc_qc = true
[nipype]
crashfile_format = "txt"
get_linked_libs = false
memory_gb = 240
nprocs = 12
omp_nthreads = 3
plugin = "MultiProc"
resource_monitor = false
stop_on_first_crash = false
[seeds]
master = 61313
[nipype.plugin_args]
maxtasksperchild = 1
raise_insufficient = false
250923-16:39:55,992 nipype.workflow IMPORTANT:
XCP-D started!
250923-16:39:56,111 nipype.workflow INFO:
Workflow xcp_d_0_9_wf settings: ['check', 'execution', 'logging', 'monitoring']
250923-16:39:56,338 nipype.workflow INFO:
Running in parallel.
The first error occurs here:
250923-16:55:58,319 nipype.workflow INFO:
[Node] Setting-up "xcp_d_0_9_wf.sub_CUPS003_wf.postprocess_0_wf.qc_report_wf.make_qc_plots_es" in "/sing_scratch/xcp_d_0_9_wf/sub_CUPS003_wf/postprocess_0_wf/qc_report_wf/make_qc_plots_$
250923-16:55:58,352 nipype.workflow INFO:
[Node] Executing "nifti_smoothing" <xcp_d.interfaces.nilearn.Smooth>
250923-16:55:58,458 nipype.workflow INFO:
[Node] Executing "ds_denoised_bold" <xcp_d.interfaces.bids.DerivativesDataSink>
250923-16:55:58,738 nipype.workflow INFO:
[Node] Executing "censor_interpolated_data" <xcp_d.interfaces.censoring.Censor>
250923-16:55:58,745 nipype.workflow INFO:
[Node] Finished "censor_interpolated_data", elapsed time 0.005265s.
250923-16:55:59,165 nipype.workflow INFO:
[Node] Executing "make_qc_plots_es" <xcp_d.interfaces.plotting.QCPlotsES>
250923-16:55:59,192 nipype.workflow INFO:
[Node] Executing "alff_compt" <xcp_d.interfaces.restingstate.ComputeALFF>
250923-16:59:04,598 nipype.workflow INFO:
[Node] Finished "make_qc_plots_es", elapsed time 185.150693s.
250923-16:59:04,601 nipype.workflow WARNING:
Storing result file without outputs
250923-16:59:04,620 nipype.workflow WARNING:
[Node] Error on "xcp_d_0_9_wf.sub_CUPS003_wf.postprocess_0_wf.qc_report_wf.make_qc_plots_es" (/sing_scratch/xcp_d_0_9_wf/sub_CUPS003_wf/postprocess_0_wf/qc_report_wf/make_qc_plots_es)
250923-17:11:08,224 nipype.workflow INFO:
[Node] Finished "ds_denoised_bold", elapsed time 909.765025s.
250923-17:11:08,355 nipype.workflow INFO:
[Job 54] Completed (xcp_d_0_9_wf.sub_CUPS003_wf.postprocess_0_wf.postproc_derivatives_wf.ds_denoised_bold).
250923-17:11:09,617 nipype.workflow INFO:
[Job 36] Completed (xcp_d_0_9_wf.sub_CUPS003_wf.postprocess_0_wf.denoise_bold_wf.censor_interpolated_data).
250923-17:11:09,619 nipype.workflow ERROR:
Node make_qc_plots_es failed to run on host cn121.delta.ncsa.illinois.edu.
250923-17:11:09,712 nipype.workflow ERROR:
Saving crash info to /data/bids/derivatives/xcp_d/sub-CUPS003/log/20250923-163831_123664ac-7b4c-4a1f-8c40-bce334b70aa1/crash-20250923-171109-pcamach2-make_qc_plots_es-ac3a5252-af5b-49c8$
Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.10/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
result["result"] = node.run(updatehash=updatehash)
File "/usr/local/miniconda/lib/python3.10/site-packages/nipype/pipeline/engine/nodes.py", line 527, in run
result = self._run_interface(execute=True)
File "/usr/local/miniconda/lib/python3.10/site-packages/nipype/pipeline/engine/nodes.py", line 645, in _run_interface
return self._run_command(execute)
File "/usr/local/miniconda/lib/python3.10/site-packages/nipype/pipeline/engine/nodes.py", line 771, in _run_command
raise NodeExecutionError(msg)
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node make_qc_plots_es.
Traceback:
Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.10/site-packages/nipype/interfaces/base/core.py", line 397, in run
runtime = self._run_interface(runtime)
File "/usr/local/miniconda/lib/python3.10/site-packages/xcp_d/interfaces/plotting.py", line 459, in _run_interface
self._results["before_process"], self._results["after_process"] = plot_fmri_es(
File "/usr/local/miniconda/lib/python3.10/site-packages/xcp_d/utils/plotting.py", line 531, in plot_fmri_es
denoised_interpolated_arr = read_ndata(datafile=denoised_interpolated_bold, maskfile=mask)
File "/usr/local/miniconda/lib/python3.10/site-packages/xcp_d/utils/write_save.py", line 42, in read_ndata
data = masking.apply_mask(datafile, maskfile)
File "/usr/local/miniconda/lib/python3.10/site-packages/nilearn/masking.py", line 809, in apply_mask
return apply_mask_fmri(
File "/usr/local/miniconda/lib/python3.10/site-packages/nilearn/masking.py", line 835, in apply_mask_fmri
imgs_img = _utils.check_niimg(imgs)
File "/usr/local/miniconda/lib/python3.10/site-packages/nilearn/_utils/niimg_conversions.py", line 315, in check_niimg
niimg = load_niimg(niimg, dtype=dtype)
File "/usr/local/miniconda/lib/python3.10/site-packages/nilearn/_utils/niimg.py", line 135, in load_niimg
dtype = _get_target_dtype(_get_data(niimg).dtype, dtype)
File "/usr/local/miniconda/lib/python3.10/site-packages/nilearn/_utils/niimg.py", line 25, in _get_data
data = np.asanyarray(img._dataobj)
File "/usr/local/miniconda/lib/python3.10/site-packages/nibabel/arrayproxy.py", line 457, in __array__
arr = self._get_scaled(dtype=dtype, slicer=())
File "/usr/local/miniconda/lib/python3.10/site-packages/nibabel/arrayproxy.py", line 424, in _get_scaled
scaled = apply_read_scaling(self._get_unscaled(slicer=slicer), scl_slope, scl_inter)
File "/usr/local/miniconda/lib/python3.10/site-packages/nibabel/arrayproxy.py", line 394, in _get_unscaled
return array_from_file(
File "/usr/local/miniconda/lib/python3.10/site-packages/nibabel/volumeutils.py", line 464, in array_from_file
data_bytes = bytearray(n_bytes)
MemoryError
250923-17:11:09,729 nipype.workflow INFO:
[MultiProc] Running 2 tasks, and 12 jobs ready. Free memory (GB): 229.00/240.00, Free processors: 8/12.
Currently running:
* xcp_d_0_9_wf.sub_CUPS003_wf.postprocess_0_wf.alff_wf.alff_compt
* xcp_d_0_9_wf.sub_CUPS003_wf.postprocess_0_wf.denoise_bold_wf.resd_smoothing_wf.nifti_smoothing
250923-17:11:09,820 nipype.workflow INFO:
[Node] Setting-up "xcp_d_0_9_wf.sub_CUPS003_wf.postprocess_0_wf.reho_nifti_wf.reho_3d" in "/sing_scratch/xcp_d_0_9_wf/sub_CUPS003_wf/postprocess_0_wf/reho_nifti_wf/reho_3d".
250923-17:11:09,827 nipype.workflow INFO:
[Node] Setting-up "xcp_d_0_9_wf.sub_CUPS003_wf.postprocess_0_wf.qc_report_wf.make_linc_qc" in "/sing_scratch/xcp_d_0_9_wf/sub_CUPS003_wf/postprocess_0_wf/qc_report_wf/make_linc_qc".
250923-17:11:09,828 nipype.workflow INFO:
[Node] Setting-up "xcp_d_0_9_wf.sub_CUPS003_wf.postprocess_0_wf.qc_report_wf.make_qc_plots_nipreps" in "/sing_scratch/xcp_d_0_9_wf/sub_CUPS003_wf/postprocess_0_wf/qc_report_wf/make_qc_p$
250923-17:11:09,890 nipype.workflow INFO:
[Node] Setting-up "_parcellate_data0" in "/sing_scratch/xcp_d_0_9_wf/sub_CUPS003_wf/postprocess_0_wf/connectivity_wf/parcellate_data/mapflow/_parcellate_data0".
250923-17:11:09,891 nipype.workflow INFO:
[Node] Setting-up "_parcellate_data1" in "/sing_scratch/xcp_d_0_9_wf/sub_CUPS003_wf/postprocess_0_wf/connectivity_wf/parcellate_data/mapflow/_parcellate_data1".
250923-17:11:09,892 nipype.workflow INFO:
[Node] Setting-up "_parcellate_data2" in "/sing_scratch/xcp_d_0_9_wf/sub_CUPS003_wf/postprocess_0_wf/connectivity_wf/parcellate_data/mapflow/_parcellate_data2".
250923-17:11:09,893 nipype.workflow INFO:
[Node] Setting-up "_parcellate_data3" in "/sing_scratch/xcp_d_0_9_wf/sub_CUPS003_wf/postprocess_0_wf/connectivity_wf/parcellate_data/mapflow/_parcellate_data3".
250923-17:11:09,894 nipype.workflow INFO:
[Node] Setting-up "_parcellate_data4" in "/sing_scratch/xcp_d_0_9_wf/sub_CUPS003_wf/postprocess_0_wf/connectivity_wf/parcellate_data/mapflow/_parcellate_data4".
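The traceback above ends in nibabel's array_from_file, which allocates one buffer for the whole 4D image (`bytearray(n_bytes)`). A rough sketch of how large that buffer could be at 1 mm resolution follows. The grid size (193 x 229 x 193, which I believe is the res-1 MNI152NLin2009cAsym grid) and the number of volumes are assumptions, since neither appears in this report. The 8-byte case is motivated by the FLOAT64 warning AFNI prints later in the log for a different file.

```python
import math

# Assumed voxel grid of the 1 mm MNI152NLin2009cAsym template.
MNI_1MM_GRID = (193, 229, 193)


def array_bytes(shape, itemsize):
    """Bytes needed to hold a dense array of this shape and element size."""
    return math.prod(shape) * itemsize


def estimate_gb(n_volumes, itemsize=8, grid=MNI_1MM_GRID):
    """Size in GB (10**9 bytes) of a 4D series loaded fully into memory."""
    return array_bytes((*grid, n_volumes), itemsize) / 1e9


if __name__ == "__main__":
    n_volumes = 500  # hypothetical run length, not taken from the report
    for label, itemsize in (("float32", 4), ("float64", 8)):
        size = estimate_gb(n_volumes, itemsize)
        print(f"{n_volumes} volumes as {label}: about {size:.1f} GB")
```

This covers a single copy of the data. Any intermediate copies made while scaling or masking would come on top, and other nodes running in parallel hold their own copies as well.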
Processing continues, but eventually stalls out with the following repeating until the job reaches its 8-hour time limit:
250923-17:11:22,581 nipype.interface WARNING:
85/556 of parcels have at least one uncovered voxel, but have enough good voxels to be useable. The bad voxels will be ignored and the parcels' time series will be calculated from the r$
** AFNI converts NIFTI_datatype=64 (FLOAT64) in file /sing_scratch/xcp_d_0_9_wf/sub_CUPS003_wf/postprocess_0_wf/reho_nifti_wf/reho_3d/inset.nii.gz to FLOAT32
Warnings of this type will be muted for this session.
Set AFNI_NIFTI_TYPE_WARN to YES to see them all, NO to see none.
exception calling callback for <Future at 0x7fecf7e87640 state=finished raised BrokenProcessPool>
Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 342, in _invoke_callbacks
callback(self)
File "/usr/local/miniconda/lib/python3.10/site-packages/nipype/pipeline/plugins/multiproc.py", line 159, in _async_callback
result = args.result()
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
exception calling callback for <Future at 0x7fecf7e878b0 state=finished raised BrokenProcessPool>
Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 342, in _invoke_callbacks
callback(self)
File "/usr/local/miniconda/lib/python3.10/site-packages/nipype/pipeline/plugins/multiproc.py", line 159, in _async_callback
result = args.result()
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 342, in _invoke_callbacks
callback(self)
File "/usr/local/miniconda/lib/python3.10/site-packages/nipype/pipeline/plugins/multiproc.py", line 159, in _async_callback
result = args.result()
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
exception calling callback for <Future at 0x7fecfc60afb0 state=finished raised BrokenProcessPool>
Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 342, in _invoke_callbacks
callback(self)
File "/usr/local/miniconda/lib/python3.10/site-packages/nipype/pipeline/plugins/multiproc.py", line 159, in _async_callback
result = args.result()
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 342, in _invoke_callbacks
callback(self)
File "/usr/local/miniconda/lib/python3.10/site-packages/nipype/pipeline/plugins/multiproc.py", line 159, in _async_callback
result = args.result()
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 342, in _invoke_callbacks
callback(self)
File "/usr/local/miniconda/lib/python3.10/site-packages/nipype/pipeline/plugins/multiproc.py", line 159, in _async_callback
result = args.result()
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/usr/local/miniconda/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
exception calling callback for <Future at 0x7fecf66da4a0 state=finished raised BrokenProcessPool>
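For context on the repeated exception above: concurrent.futures raises BrokenProcessPool when a worker process dies without returning a result, for example when it is killed from outside. The log does not show what ended the workers here, so the sketch below only reproduces the symptom. The function name is hypothetical, and `os._exit` merely stands in for an abrupt termination.

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def pool_breaks_on_abrupt_exit():
    """Return True if an abruptly terminated worker breaks the pool."""
    # The "fork" start method is POSIX-only; it is used so the sketch does
    # not depend on how the calling script is imported.
    context = multiprocessing.get_context("fork")
    with ProcessPoolExecutor(max_workers=1, mp_context=context) as pool:
        # os._exit ends the worker immediately, without cleanup and without
        # sending a result back to the parent process.
        future = pool.submit(os._exit, 1)
        try:
            future.result(timeout=60)
        except BrokenProcessPool:
            return True
    return False


if __name__ == "__main__":
    print("pool broken:", pool_breaks_on_abrupt_exit())
```

Once a pool is broken, every pending future in it fails with the same exception, which would be consistent with the message repeating for several futures in the log.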
Reproducing the bug
APPTAINER_CACHEDIR=${CACHESING} APPTAINER_TMPDIR=${TMPSING} apptainer run \
--cleanenv --containall --no-home --bind ${IMAGEDIR}:/imgdir,${TMPSING}:/sing_scratch \
--bind ${projDir}:/data ${IMAGEDIR}/xcp_d-v0.9.1.sif \
--participant-label CUPS003 --nthreads 12 \
--omp-nthreads 3 --mem-gb 240 \
--input-type fmriprep --smoothing 5 -p aroma \
--motion-filter-type none \
--atlases 4S156Parcels 4S256Parcels 4S356Parcels 4S456Parcels 4S556Parcels 4S656Parcels Glasser Gordon \
--combine-runs n --despike n \
--file-format nifti --linc-qc y --min-coverage 0.5 --output-type interpolated \
--warp-surfaces-native2std n --abcc-qc y \
--lower-bpf 0.01 --upper-bpf 0.08 --bpf-order 2 \
--notrack --write-graph -vv --create-matrices all --low-mem \
--mode none -f 0 -w /sing_scratch --notrack --fs-license-file /imgdir/license.txt \
/data/bids/derivatives/fmriprep /data/bids/derivatives/xcp_d participant