MedAlign is a benchmark dataset of 983 clinician-curated natural language instructions grounded in 275 longitudinal electronic health records (EHRs). It includes 303 reference responses to support evaluation of large language models (LLMs) on clinical reasoning, timeline understanding, and multi-document synthesis.
Warning: MedAlign is a test-only benchmark. It is not intended for supervised model training.
The benchmark includes:
| Component | Count |
|---|---|
| Patients | 275 |
| Clinical notes | 46,252 |
| Distinct note types | 128 |
| Clinical events (OMOP format) | 3.6 million |
| Instructions (deduplicated) | 983 |
| Reference responses | 303 |
All EHR data is longitudinal and standardized using the OMOP Common Data Model (CDM).
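To put the counts in the table above in proportion, the sketch below derives a few per-patient averages and the share of instructions that have a reference response. The numbers are copied from the table; the variable and function names are illustrative, the event count is approximate because it is reported as "3.6 million", and the coverage figure assumes each reference response answers one distinct instruction, which the text does not state.

```python
# Counts copied from the component table; names are illustrative only.
COUNTS = {
    "patients": 275,
    "clinical_notes": 46_252,
    "clinical_events": 3_600_000,  # reported as "3.6 million", so approximate
    "instructions": 983,
    "reference_responses": 303,
}


def per_patient(total: int, patients: int) -> float:
    """Average number of items per patient."""
    return total / patients


def coverage(covered: int, total: int) -> float:
    """Fraction of `total` items that have a counterpart among `covered`."""
    return covered / total


notes_per_patient = per_patient(COUNTS["clinical_notes"], COUNTS["patients"])
events_per_patient = per_patient(COUNTS["clinical_events"], COUNTS["patients"])
# Assumption: one reference response per distinct instruction.
reference_coverage = coverage(COUNTS["reference_responses"], COUNTS["instructions"])

print(f"notes per patient:  {notes_per_patient:.1f}")
print(f"events per patient: {events_per_patient:.0f}")
print(f"instructions with a reference response: {reference_coverage:.1%}")
```

Under these assumptions, roughly 168 notes and 13,000 events are available per patient, and about 31% of instructions have a reference response, so reference-based metrics cover only a subset of the benchmark.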
Caution: Violations of the data use agreement will result in revoked access and may trigger institutional reporting.
- ❌ Do not train or fine-tune models on MedAlign data (evaluation only).
- ❌ Do not transmit MedAlign data to commercial APIs (e.g., ChatGPT, Claude, Gemini) that are not HIPAA compliant.
- ❌ Do not redistribute dataset files or any derivative datasets.
- ✅ Derived research artifacts (e.g., annotations, synthetic data) must be hosted on Redivis with prior approval from the MedAlign team.
All usage must strictly comply with the MedAlign DUA.
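One way to reduce the risk of accidentally sending MedAlign data to an external API is to check every inference endpoint against an explicit allowlist before any request is made. The following is a minimal sketch of that idea, not part of MedAlign or the DUA: `APPROVED_HOSTS`, `is_approved_endpoint`, and `require_approved_endpoint` are hypothetical names, and the allowlist entries are placeholders that you would replace with hosts you have confirmed are permitted.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: hosts you have confirmed are permitted under the DUA
# (for example, a local inference server). These entries are placeholders.
APPROVED_HOSTS = frozenset({"localhost", "127.0.0.1"})


def is_approved_endpoint(url: str, approved_hosts: frozenset = APPROVED_HOSTS) -> bool:
    """Return True only if the URL's host is on the allowlist.

    URLs without a parseable host (e.g. a missing scheme) are rejected,
    so the check fails closed.
    """
    host = urlparse(url).hostname
    return host is not None and host in approved_hosts


def require_approved_endpoint(url: str) -> str:
    """Raise PermissionError before any data is sent to an unapproved host."""
    if not is_approved_endpoint(url):
        raise PermissionError(f"Endpoint not on the approved list: {url}")
    return url


if __name__ == "__main__":
    print(is_approved_endpoint("http://localhost:8000/v1/completions"))
    print(is_approved_endpoint("https://api.example.com/v1/chat"))
```

A guard like this only catches mistakes in your own code paths. It does not replace reading the DUA or confirming the compliance status of any service you use.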
To gain access, please complete the following steps:
- Apply via Redivis Portal
  Use an academic, government, or industry research email. Applications from personal (e.g., Gmail) accounts will be rejected.
- Complete HIPAA-compliant CITI training
  You must include proof of training in your application.
- Describe your research use case
  A short paragraph outlining your intended use is sufficient.
- Sign the MedAlign Data Use Agreement (DUA)
  This will be sent to you after your application is reviewed.
- Verify encryption and secure storage
  You must attest to storing the data on encrypted, access-controlled machines. Cloud use requires HIPAA compliance.
⏱️ Applications are reviewed within 7–10 business days.
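Before submitting, it can help to run through the application requirements above as a checklist. The sketch below is a hypothetical self-check, not a Redivis or MedAlign tool: `ApplicationDraft`, `missing_items`, and `PERSONAL_DOMAINS` are invented for illustration, and only Gmail is named in the text as a personal email provider. The DUA is omitted because it is sent after review.

```python
from dataclasses import dataclass

# Only Gmail is named in the text; any further domains are your own assumption.
PERSONAL_DOMAINS = frozenset({"gmail.com"})


@dataclass
class ApplicationDraft:
    """Hypothetical summary of what an applicant has prepared."""
    email: str
    citi_proof_attached: bool
    use_case: str
    storage_attested: bool  # encrypted, access-controlled storage


def missing_items(draft: ApplicationDraft) -> list:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    domain = draft.email.rpartition("@")[2].strip().lower()
    if "@" not in draft.email or not domain:
        problems.append("email address is malformed")
    elif domain in PERSONAL_DOMAINS:
        problems.append("email uses a personal domain")
    if not draft.citi_proof_attached:
        problems.append("proof of CITI training is not attached")
    if not draft.use_case.strip():
        problems.append("research use case is empty")
    if not draft.storage_attested:
        problems.append("secure storage has not been attested")
    return problems


if __name__ == "__main__":
    draft = ApplicationDraft("jane@gmail.com", False, "Evaluate LLM summaries.", True)
    for problem in missing_items(draft):
        print("-", problem)
```

Passing this check says nothing about whether an application will be approved; it only flags the omissions the steps above mention explicitly.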
If you use MedAlign in your work, please cite:
@inproceedings{DBLP:conf/aaai/FlemingLHJRTBGS24,
author = {Scott L. Fleming and Alejandro Lozano and William J. Haberkorn and Jenelle A. Jindal and Eduardo Reis and Rahul Thapa and Louis Blankemeier and Julian Z. Genkins and Ethan Steinberg and Ashwin Nayak and Birju S. Patel and Chia{-}Chun Chiang and Alison Callahan and Zepeng Huo and Sergios Gatidis and Scott J. Adams and Oluseyi Fayanju and Shreya J. Shah and Thomas Savage and Ethan Goh and Akshay S. Chaudhari and Nima Aghaeepour and Christopher D. Sharp and Michael A. Pfeffer and Percy Liang and Jonathan H. Chen and Keith E. Morse and Emma P. Brunskill and Jason A. Fries and Nigam H. Shah},
title = {MedAlign: {A} Clinician-Generated Dataset for Instruction Following with Electronic Medical Records},
booktitle = {Thirty-Eighth {AAAI} Conference on Artificial Intelligence},
year = {2024},
url = {https://doi.org/10.1609/aaai.v38i20.30205},
doi = {10.1609/AAAI.V38I20.30205},
}