DOC: derivatives datasets vs derivatives data #2103

CPernet opened this issue Apr 16, 2025 · 2 comments
Labels
community (Issues related to building and supporting the BIDS community), documentation

@CPernet
Collaborator

CPernet commented Apr 16, 2025

@robertoostenveld and I have discussed the lack of a clear distinction between derivatives datasets and derivatives data. The information is in the specification, but we need to think about how to explain it better.

Example found on OpenNeuro (@effigies):
dataset_description.json - DatasetType set to derivatives
derivatives
|- sub-01
|- sub-02
sub-01
sub-02

This should be DatasetType raw: it is a raw dataset that also contains derivative data. Several datasets are now flagged as derivatives in this way, which suggests we have a communication issue.
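A minimal sketch of how such a mismatch could be spotted automatically. The function name `dataset_type_mismatch` and the heuristic itself (a root with both `sub-*` directories and a `derivatives/` directory is probably raw) are assumptions made for illustration; they are not part of the specification or of bids-validator.

```python
import json
from pathlib import Path


def dataset_type_mismatch(root: Path) -> bool:
    """Return True if the root declares a non-raw DatasetType but looks raw.

    Heuristic (an assumption, not a spec rule): a dataset root holding both
    sub-* directories and a derivatives/ directory is most likely a raw
    dataset that also ships derivative data.
    """
    description = json.loads((root / "dataset_description.json").read_text())
    # A missing DatasetType is treated as "raw" here.
    declared = description.get("DatasetType", "raw")
    has_subjects = any(p.is_dir() for p in root.glob("sub-*"))
    has_derivatives_dir = (root / "derivatives").is_dir()
    return declared != "raw" and has_subjects and has_derivatives_dir
```

Calling `dataset_type_mismatch(Path("path/to/dataset"))` on the layout above would flag it, since the root declares a derivatives type while holding both subject folders and a `derivatives/` folder.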

Proposal: in https://bids-specification.readthedocs.io/en/stable/common-principles.html#storage-of-derived-datasets use the headers "raw dataset with derivative data", "derivatives dataset", and "non-compliant derivative data".

@CPernet added the community and documentation labels Apr 16, 2025
@yarikoptic
Collaborator

Is that to prime discussion on "derivative BIDS" at the upcoming meeting in Copenhagen, to keep it a "theme topic"? ;)

I think this is largely a historical artifact of encouraging people to keep derivative datasets under derivatives/ of a raw BIDS dataset. Having said that, there is one derivative I have started to place under derivatives/ which I think makes total sense to include: the output of bids-validator, a derivative worth shipping within a "raw" dataset. See e.g. ds005256.

But in my view, a "raw" BIDS dataset already contains "derivative" data 99% of the time. Including some more derivatives on top of the "raw" data just makes it heavier, and potentially nests "derivative" datasets inside it, but overall it remains a "raw BIDS dataset". We might indeed want to document that better. I am not quite certain what you mean by "headers", @CPernet?

Also relates to

@CPernet
Collaborator Author

CPernet commented Apr 16, 2025

I meant the 'titles' in https://bids-specification.readthedocs.io/en/stable/common-principles.html#storage-of-derived-datasets, which also appear in the table of contents.
What about showing:

  • a raw dataset with sources and derivatives, DatasetType: raw
  • a derivatives dataset, DatasetType: derivative
  • a raw dataset with non-compliant derivatives data, DatasetType: raw

pretty much shows that already, but being more explicit about the content of the dataset_description.json
(I can do it; I just want your opinion first, then you can review it.)
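As a rough illustration of the three cases listed above, here is a sketch of what each dataset_description.json might contain. The dataset names, the pipeline name and the BIDSVersion value are placeholders invented for this example. The lowercase `DatasetType` values and the `GeneratedBy` entry follow my reading of the specification and should be checked against the current text.

```python
import json

# All names and the BIDSVersion below are placeholders for illustration.

# Case 1: raw dataset that also ships sources and derivatives.
# The root description stays "raw".
raw_with_derivatives = {
    "Name": "Example raw dataset",
    "BIDSVersion": "1.10.0",
    "DatasetType": "raw",
}

# Case 2: derivatives dataset. It carries its own description, whether it is
# stored standalone or nested under derivatives/ of a raw dataset.
derivatives_dataset = {
    "Name": "Example pipeline outputs",
    "BIDSVersion": "1.10.0",
    "DatasetType": "derivative",
    "GeneratedBy": [{"Name": "example-pipeline"}],
}

# Case 3: raw dataset with non-compliant derivative data under derivatives/.
# The root description is still "raw"; the non-compliant content is assumed
# here to have no dataset_description.json of its own.
raw_with_noncompliant = {
    "Name": "Example raw dataset with extra outputs",
    "BIDSVersion": "1.10.0",
    "DatasetType": "raw",
}

cases = {
    "raw with sources and derivatives": raw_with_derivatives,
    "derivatives dataset": derivatives_dataset,
    "raw with non-compliant derivatives": raw_with_noncompliant,
}

for label, description in cases.items():
    print(f"# {label}")
    print(json.dumps(description, indent=2))
```

The point of the sketch is that only the second case declares a derivative type; in the other two the presence of a derivatives/ folder does not change what the root dataset is.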
