-
Notifications
You must be signed in to change notification settings - Fork 13
Enhance print-metadata-fields to group by root JSON and modalities #473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Enhance print-metadata-fields to group by root JSON and modalities #473
Conversation
…tional BIDS datasets
… unique ones The previous implementation filtered to show only fields unique to each category, which caused common fields like 'Manufacturer' to disappear from all modalities. This fixes test failures in test_remove_fields and test_print_metadata_fields_command_with_test_dataset. Now all fields that exist in each modality are shown, which is the expected behavior. Fields are still unique within each modality (no duplication), but fields that appear in multiple modalities are now correctly shown in each one.
The test was pointing to the wrong directory. The get_data() function copies cubids/tests/data/ which contains multiple BIDS datasets (BIDS_Dataset, complete, inconsistent, etc). The test needs to specify which dataset to use, like other tests do with data_root / 'complete'. This fixes the test failure where metadata_fields was empty because no subjects were found at the root level.
The code was incorrectly searching for '_events.json' in root_fields dictionary keys (which are filenames like 'dataset_description.json'). While BIDS does allow task-level files at root (e.g., task-rest_events.json), these are already correctly captured by the bids_path.glob('*.json') operation and stored in root_fields with their full filename as the key. The removed code was: - Searching in the wrong structure (dictionary keys instead of file contents) - Using incorrect pattern matching ('_events.json' in filename) - Completely redundant as root-level JSON files are already properly collected Addresses feedback from cursor bot review.
for mod, fields in modalities.items(): | ||
result[mod] = sorted(fields) | ||
|
||
return result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Test Fails to Detect Field Removal
The test_remove_fields
no longer correctly verifies field removal. The get_all_metadata_fields()
method now returns a dictionary, but the test's assertion set(new_fields)
incorrectly creates a set of dictionary keys (filenames/modality names) instead of metadata field names. This causes the intersection check to always pass, making the test report success even if fields were not removed.
Description:
This PR improves the output of the cubids print-metadata-fields command to provide a cleaner, more organized view of metadata fields in a BIDS dataset. Instead of listing all fields from every JSON file individually (which can be overwhelming ), the command now groups fields as follows:
Root-level JSON files (e.g., dataset_description.json, participants.json) are listed with all their metadata fields.
Modalities (e.g., func, anat, dwi) group all nested JSON files by type and display only the metadata fields unique to that modality (fields that don't appear in other categories).
This reduces redundancy—fields like EchoTime that appear across many files are shown only once under their respective modality, avoiding clutter in the output.
Changes Made:
Modified get_all_metadata_fields() in cubids.py to collect and categorize fields by root files and modalities, filtering for uniqueness.
Updated print_metadata_fields() in workflows.py to display the grouped output.
Example Output (Before):
AcquisitionTime
BandwidthPerPixelPhaseEncode
BodyPartExamined
CoilString
ConversionSoftware
ConversionSoftwareVersion
DerivedVendorReportedEchoSpacing
DeviceSerialNumber
EchoTime
EchoTrainLength
EffectiveEchoSpacing
Example Output (After):
dataset_description.json
Acknowledgements
Authors
...
func
EchoTime
RepetitionTime
TaskName
...
participants.json
age
sex
...
This enhancement makes it easier to get an overview of metadata structure without duplication, addressing user feedback for a more structured and unique field listing.
Best, Karl