Mauricio DIAZ edited this page May 17, 2019 · 12 revisions

Data handling tools

This page describes data handling tools provided by Clinica for BIDS and CAPS compliant datasets. These tools provide easy interaction mechanisms with datasets, including generating subject lists or merging all tabular data into a single TSV for analysis with external statistical software tools.

## create-subjects-visits - Generate the list of all subjects and visits of a given dataset

A TSV file with two columns (participant_id and session_id) containing the list of visits for each subject can be created as follows:

clinica iotools create-subjects-visits bids_directory output_tsv

where:

  • bids_directory: input folder of a BIDS compliant dataset,
  • output_tsv: output TSV file containing the subjects with their sessions.

Here is an example of the file generated by this tool:

participant_id   session_id
sub-01           ses-M00
sub-02           ses-M24
sub-03           ses-M24
...

!!! note
    The format of the participant ID and the session ID follows the BIDS standard.

Example:

clinica iotools create-subjects-visits /home/ADNI_BIDS/ adni_participants.tsv
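The generated file can be consumed with any TSV reader. The following is a minimal sketch, assuming the file is tab-separated with the two column names shown above; the helper name `sessions_by_subject` and the inline sample are hypothetical and are not part of Clinica.

```python
import csv
import io
from collections import defaultdict


def sessions_by_subject(tsv_file):
    """Map each participant_id to the list of its session_id values."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    visits = defaultdict(list)
    for row in reader:
        visits[row["participant_id"]].append(row["session_id"])
    return dict(visits)


# Inline sample standing in for a file produced by create-subjects-visits.
sample = (
    "participant_id\tsession_id\n"
    "sub-01\tses-M00\n"
    "sub-01\tses-M24\n"
    "sub-02\tses-M24\n"
)
visits = sessions_by_subject(io.StringIO(sample))
print(visits)

# For a real file, pass an open handle instead, for example:
# with open("adni_participants.tsv", newline="") as handle:
#     visits = sessions_by_subject(handle)
```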

## check-missing-modalities - Check missing modalities for each subject

Starting from a BIDS compliant dataset, this command creates:

  1. A TSV file for each available session, listing the modalities found for each subject. These files are named <prefix>_ses-<session_label>.tsv.
  2. A text file containing the number and the percentage of missing modalities for each session. This file is named <prefix>_summary.txt.

If no value for <prefix> is specified by the user, the default will be missing_mods.

clinica iotools check-missing-modalities bids_directory output_directory [-op]

where:

  • bids_directory: input folder of a BIDS compliant dataset
  • output_directory: output folder
  • -op / --output_prefix (Optional): prefix used for the name of the output files. If not specified the default value will be missing_mods

If, for example, only the session M00 is available and the parameter -op is not specified, the command will create the files:

  • missing_mods_ses-M00.tsv
  • missing_mods_summary.txt.

The content of missing_mods_ses-M00.tsv will look like:

participant_id   T1w   DWI
sub-01           1       1
sub-02           1       0
sub-03           1       0

The participant_id column lists all the subjects found, and each following column corresponds to one of the modalities available in the dataset. Availability is expressed as a boolean value: 1 if the modality was found for the subject, 0 if it is missing.

The nomenclature of the modalities tries to follow, as much as possible, the one proposed by the BIDS standard.
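As an illustration of how the per-session file relates to the counts and percentages reported in the summary, here is a minimal sketch that recomputes them from one session file. It assumes a tab-separated file with 1/0 values as in the example above; the helper name `missing_summary` is hypothetical, and the actual layout of <prefix>_summary.txt is not reproduced here.

```python
import csv
import io


def missing_summary(tsv_file):
    """Return {modality: (n_missing, percent_missing)} for one session file."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    rows = list(reader)
    # Every column except participant_id is treated as a modality.
    modalities = [name for name in (reader.fieldnames or [])
                  if name != "participant_id"]
    summary = {}
    for modality in modalities:
        n_missing = sum(1 for row in rows if row[modality].strip() == "0")
        percent = 100.0 * n_missing / len(rows) if rows else 0.0
        summary[modality] = (n_missing, percent)
    return summary


# Inline sample mirroring the missing_mods_ses-M00.tsv example.
sample = (
    "participant_id\tT1w\tDWI\n"
    "sub-01\t1\t1\n"
    "sub-02\t1\t0\n"
    "sub-03\t1\t0\n"
)
summary = missing_summary(io.StringIO(sample))
for modality, (count, percent) in summary.items():
    print(f"{modality}: {count} missing ({percent:.1f}%)")
```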

Examples:

clinica iotools check-missing-modalities /Home/ADNI_BIDS/ /Home/
clinica iotools check-missing-modalities /Home/ADNI_BIDS/ /Home/ -op new_name

## merge-tsv - Gather BIDS and CAPS data into a single TSV file

BIDS and CAPS datasets are composed of multiple TSV files for the different subjects and sessions. While this has some advantages, it may not be convenient when performing statistical analyses (with external statistical software tools for instance). This command merges all the TSV files into a single larger TSV file and can be run with the following command line:

clinica iotools merge-tsv bids_directory output_tsv

where:

  • bids_directory is the input folder containing the dataset in a BIDS hierarchy.
  • output_tsv is the path of the output TSV file. If a directory is specified instead of a file name, the file created will be named merge-tsv.tsv by default.

The optional arguments allow the user to also merge data from a CAPS directory, which will be concatenated to the BIDS summary. The main optional arguments are the following:

  • -caps: input folder of a CAPS compliant dataset

If a CAPS folder is given, data generated by the pipelines of Clinica (regional measures) will be merged into the output file, and a summary file containing the names of the merged atlases will be generated in the same folder.

  • -tsv: input list of subjects and sessions

If an input list of subjects and sessions is given, the merged file will only gather information from the pairs of subjects and sessions specified.
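One way to prepare such a list is to filter a subjects and visits file down to the visits of interest. The sketch below assumes, without confirmation from this page, that the list passed to -tsv uses the same two-column layout (participant_id, session_id) as the output of create-subjects-visits; the helper name `filter_visits` is hypothetical.

```python
import csv
import io


def filter_visits(in_file, out_file, session_id):
    """Copy only the rows of one session to out_file; return the row count."""
    reader = csv.DictReader(in_file, delimiter="\t")
    writer = csv.DictWriter(
        out_file,
        fieldnames=["participant_id", "session_id"],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    kept = 0
    for row in reader:
        if row["session_id"] == session_id:
            writer.writerow({
                "participant_id": row["participant_id"],
                "session_id": row["session_id"],
            })
            kept += 1
    return kept


# Inline sample standing in for a subjects and visits file.
source = io.StringIO(
    "participant_id\tsession_id\n"
    "sub-01\tses-M00\n"
    "sub-02\tses-M24\n"
    "sub-03\tses-M24\n"
)
target = io.StringIO()
kept = filter_visits(source, target, "ses-M24")
print(kept)
print(target.getvalue())
```

With real files, `source` and `target` would be handles opened with `newline=""`, and the resulting file could then be given to the -tsv argument.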

Example:

clinica iotools merge-tsv /Home/ADNI_BIDS /Home/merge-tsv.tsv -caps /Home/ADNI_CAPS -tsv /Home/list_subjects.tsv

The output file will contain one row for each visit:

participant_id   session_id   date_of_birth   ...   ..._ROI-0   ..._ROI-1  ...
sub-01           ses-M00      25/04/41        ...   9.824750    0.023562
sub-01           ses-M18      25/04/41        ...   8.865353    0.012349
sub-02           ses-M00      09/01/91        ...   9.586342    0.027254
...
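Because each row is one visit, a measure can be indexed by the pair (participant_id, session_id). The following is a minimal sketch, assuming a tab-separated file; the helper name `column_by_visit` and the column name `example_ROI-0` are hypothetical placeholders (the real column names depend on the atlases merged), and treating empty cells or "n/a" as missing values is also an assumption.

```python
import csv
import io


def column_by_visit(tsv_file, column):
    """Return {(participant_id, session_id): float} for one numeric column."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    values = {}
    for row in reader:
        raw = (row.get(column) or "").strip()
        if raw in ("", "n/a"):
            continue  # skip visits without a value for this column
        values[(row["participant_id"], row["session_id"])] = float(raw)
    return values


# Inline sample; "example_ROI-0" is a hypothetical column name.
sample = (
    "participant_id\tsession_id\texample_ROI-0\n"
    "sub-01\tses-M00\t9.824750\n"
    "sub-01\tses-M18\t8.865353\n"
    "sub-02\tses-M00\t9.586342\n"
    "sub-03\tses-M00\tn/a\n"
)
values = column_by_visit(io.StringIO(sample), "example_ROI-0")
for (participant, session), value in values.items():
    print(participant, session, value)
```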