Repo for the ontology-based rare disease common data model harmonising international registry use, FHIR, and Phenopackets

BIH-CEI/rd-cdm

ontology-based rare disease common data model

Welcome to the repo of the ontology-based rare disease common data model (RD-CDM) harmonising international registry use, HL7® FHIR®, and the GA4GH Phenopacket Schema.

Latest docs: https://rd-cdm.readthedocs.io/en/latest/

Manuscript

The corresponding paper for RD-CDM v2.0.0 has been published in Scientific Data (Nature Portfolio):
https://www.nature.com/articles/s41597-025-04558-z



Project Description

The ontology-based RD-CDM harmonises rare disease data capture across registries. It integrates the ERDRI-CDS, HL7 FHIR, and the GA4GH Phenopacket Schema to support interoperable data for research and care. RD-CDM v2.0.x comprises 78 data elements covering formal criteria, personal information, patient status, disease, genetic findings, phenotypic findings, and family history.


What you get from PyPI

Installing rd-cdm from PyPI provides:

  • Schema

    • src/rd_cdm/schema/rd_cdm.yaml
  • Versioned instances (data packs)

    • src/rd_cdm/instances/v2_0_1/*.yaml (e.g., code_systems.yaml, data_elements.yaml, value_sets.yaml)
    • merged file: src/rd_cdm/instances/v2_0_1/rd_cdm_v2_0_1.yaml
    • exports (if present or generated locally):
      • src/rd_cdm/instances/v2_0_1/jsons/*.json
      • src/rd_cdm/instances/v2_0_1/csvs/*.csv
  • Generated Python & Pydantic classes (LinkML)

    • src/rd_cdm/python_classes/rd_cdm.py (LinkML runtime dataclasses)
    • src/rd_cdm/python_classes/rd_cdm_pydantic.py (generated from the schema via LinkML’s Pydantic generator)
  • Utilities / CLI entry points

    • rdcdm-merge – merge instance parts into rd_cdm_vX_Y_Z.yaml
    • rdcdm-json – per-file JSON export + combined rd_cdm_vX_Y_Z.json
    • rdcdm-csv – per-file CSV export + combined rd_cdm_vX_Y_Z.csv
    • rdcdm-validate – validate ontology codes via BioPortal
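
The layout above follows a predictable naming pattern, so the location of a merged data pack can be derived from a version string. The following is a minimal sketch and not part of the package: the helper names `version_dirname` and `merged_instance_relpath` are hypothetical, and it assumes that a version such as 2.0.1 always maps to the folder v2_0_1, as in the paths listed above.

```python
from pathlib import PurePosixPath


def version_dirname(version: str) -> str:
    """Map a dotted version such as '2.0.1' to the folder name 'v2_0_1'."""
    parts = version.lstrip("v").split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    return "v" + "_".join(parts)


def merged_instance_relpath(version: str) -> PurePosixPath:
    """Relative path of the merged YAML inside the package (assumed layout)."""
    folder = version_dirname(version)
    return PurePosixPath("instances") / folder / f"rd_cdm_{folder}.yaml"


if __name__ == "__main__":
    print(merged_instance_relpath("2.0.1"))
```

With the package installed, such a relative path could be joined onto `importlib.resources.files("rd_cdm")`, assuming the import name matches the src/rd_cdm/ folder shown above.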

Features

  • Interoperability: Aligns with HL7 FHIR v4.0.1 and GA4GH Phenopacket v2.0
  • Ontology-driven: Uses SNOMED CT, LOINC, NCIT, MONDO, OMIM, HPO, and more
  • Modular: Clear separation of schema, instances, and exports
  • Versioned data: Instances shipped and resolved per version (e.g., v2_0_1)
  • Tooling: Merge, export, and validation utilities with simple CLIs
  • (Optional) Pydantic models: Strict runtime validation generated from LinkML
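
Resolving instances per version, with a fallback to the latest one, can be illustrated with a short sketch. This is not the package's actual resolver: the function `latest_version_dir` is hypothetical, and it assumes version folders are named like v2_0_1 and should be compared numerically rather than as strings.

```python
import re
from typing import Iterable, Optional

# Folder names such as v2_0_1 (assumed naming convention).
_VERSION_DIR = re.compile(r"^v(\d+)_(\d+)_(\d+)$")


def latest_version_dir(names: Iterable[str]) -> Optional[str]:
    """Return the highest vMAJOR_MINOR_PATCH folder name, or None if none match."""
    candidates = []
    for name in names:
        match = _VERSION_DIR.match(name)
        if match:
            # Compare as integer tuples so that v2_0_10 sorts after v2_0_9.
            candidates.append((tuple(int(g) for g in match.groups()), name))
    return max(candidates)[1] if candidates else None


if __name__ == "__main__":
    print(latest_version_dir(["v2_0_0", "v2_0_1", "__pycache__"]))
```

Comparing integer tuples avoids the string-sorting pitfall in which "v2_0_10" would be ordered before "v2_0_9".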

Installation

From PyPI:

pip install rd-cdm

Optional extras for testing/docs:

pip install "rd-cdm[test]"   # pytest, etc. (quotes keep zsh from globbing the brackets)
# or
pip install "rd-cdm[docs]"

Development install

git clone https://github.com/BIH-CEI/rd-cdm.git
cd rd-cdm
# (Recommended) create a venv
python -m venv .venv && source .venv/bin/activate
pip install -U pip
pip install -e ".[test]"
pytest -q

We use a src/ layout. If you run tools directly, ensure PYTHONPATH=src or use the installed CLI entry points shown below.
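
A quick way to check whether the package is visible to the current interpreter, for example after setting PYTHONPATH=src, is sketched below. It assumes the import name is `rd_cdm`, inferred from the src/rd_cdm/ paths in this README; the helper `is_importable` is illustrative only.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the interpreter can locate the given top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_importable("rd_cdm"):
        print("rd_cdm found on the import path")
    else:
        print("rd_cdm not found: install the package or set PYTHONPATH=src")
```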


CLI tools

After installation you should have these commands:

# Merge the versioned parts into rd_cdm_vX_Y_Z.yaml (auto-resolves latest if not given)
rdcdm-merge                  # or: rdcdm-merge --version 2.0.1

# Export JSON (per-file .json + combined rd_cdm_vX_Y_Z.json)
rdcdm-json                   # or: rdcdm-json -v 2.0.1

# Export CSV (per-file .csv + combined rd_cdm_vX_Y_Z.csv)
rdcdm-csv                    # or: rdcdm-csv -v 2.0.1

# Validate merged instance file against ontologies via BioPortal
rdcdm-validate               # or: rdcdm-validate -v 2.0.1 (Note: set up BioPortal API key for this)
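
The four commands can also be chained from a script, for instance in CI. The sketch below uses only the flags shown above (`--version` for rdcdm-merge, `-v` for the others) and assumes the order merge, export, validate; the function names `build_commands` and `run_pipeline` are hypothetical and not part of rd-cdm.

```python
import shutil
import subprocess
from typing import List, Optional


def build_commands(version: Optional[str] = None) -> List[List[str]]:
    """Build the CLI invocations in order; omit the flag to use the latest version."""
    steps = [
        ("rdcdm-merge", "--version"),
        ("rdcdm-json", "-v"),
        ("rdcdm-csv", "-v"),
        ("rdcdm-validate", "-v"),
    ]
    return [[tool, flag, version] if version else [tool] for tool, flag in steps]


def run_pipeline(version: Optional[str] = None) -> None:
    """Run each step in turn and stop at the first failure."""
    for cmd in build_commands(version):
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found; is rd-cdm installed?")
        subprocess.run(cmd, check=True)  # raises CalledProcessError on non-zero exit


# Example (requires rd-cdm to be installed and a BioPortal key for the last step):
# run_pipeline("2.0.1")
```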

BioPortal API Key Setup for Validation

The rdcdm-validate command uses the BioPortal API to check ontology term validity. This requires an API key to be set as an environment variable.

Get an API key:

  • Sign up (or log in) at https://bioportal.bioontology.org/accounts/new
  • Go to your account settings and copy your API key.
  • Set the API key in your environment:

macOS / Linux (bash/zsh):

export BIOPORTAL_API_KEY="your-key-here"

Windows (PowerShell; setx takes effect in newly opened terminal sessions):

setx BIOPORTAL_API_KEY "your-key-here"
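
Before calling rdcdm-validate from a script, it can help to fail early when the key is missing. The sketch below reads the BIOPORTAL_API_KEY variable named above; the helper `require_bioportal_key` is hypothetical and not part of rd-cdm.

```python
import os
from typing import Mapping, Optional


def require_bioportal_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the BioPortal API key, or raise if it is unset or blank."""
    env = os.environ if env is None else env
    key = env.get("BIOPORTAL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("BIOPORTAL_API_KEY is not set; see the setup steps above")
    return key
```

Passing a mapping explicitly, rather than always reading `os.environ`, keeps the check easy to exercise in tests without touching the real environment.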

Contributing and Contact

The RD-CDM is a community-driven effort and we invite open and international collaboration. Please feel free to create issues, discuss features, or submit pull requests to help enhance this project. For larger contributions, consider reaching out to discuss collaboration opportunities. Please find more information on how to contact us and contribute in the Contribution section of our documentation.

RareLink

RareLink is a novel rare disease framework in REDCap linking international registries, FHIR, and Phenopackets based on the RD-CDM. It is designed to support the collection of harmonized data for rare disease research across any REDCap project worldwide and allows for the preconfigured export of the RD-CDM data in FHIR and Phenopackets formats.

Resources

Ontologies

  • Human Phenotype Ontology 🔗
  • Monarch Initiative Disease Ontology 🔗
  • Online Mendelian Inheritance in Man 🔗
  • Orphanet Rare Disease Ontology 🔗
  • SNOMED CT 🔗
  • ICD 11 🔗
  • ICD10CM 🔗
  • National Center for Biotechnology Information Taxonomy 🔗
  • Logical Observation Identifiers Names and Codes 🔗
  • HUGO Gene Nomenclature Committee 🔗
  • Gene Ontology 🔗
  • NCI Thesaurus OBO Edition 🔗

For the versions used in a specific RD-CDM version, please see the resources in our documentation.

License

This project is licensed under the terms of the MIT License.

Citing

If you use the model in your research, please cite our article, and feel free to reach out:

Graefe, A.S.L., Hübner, M.R., Rehburg, F. et al. An ontology-based rare disease common data model harmonising international registries, FHIR, and Phenopackets. Sci Data 12, 234 (2025). https://doi.org/10.1038/s41597-025-04558-z

Acknowledgements

We would like to extend our thanks to all the authors involved in the development of the RD-CDM.

