Natural Hallucination Dataset - ACI-bench Clinical Note Hallucination Annotations

This repository contains expert-annotated hallucination labels for clinical summaries of ACI-bench conversations, intended for evaluating hallucination detection in medical text summarization.

Dataset Overview

The Natural Hallucination (NH) dataset contains expert annotations of hallucinations in clinical summaries, focused on SOAP notes from the ACI-bench collection of clinical conversations.

Annotation Categories & Counts

Expert clinical scribes assigned each statement to one of four categories. The counts per category are:

No Error: 12,365
Hallucination: 106
Inference: 87
Misunderstanding: 72
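
To put these counts in proportion, the short sketch below recomputes the total number of annotated statements and the share of each category. The numbers are copied from the list above; treating every category other than No Error as an error is an assumption made for this illustration.

```python
# Statement counts per annotation category, copied from the list above.
counts = {
    "No Error": 12365,
    "Hallucination": 106,
    "Inference": 87,
    "Misunderstanding": 72,
}

total = sum(counts.values())
# Assumption: every category other than "No Error" counts as an error.
error_total = total - counts["No Error"]


def share(label: str) -> float:
    """Fraction of all annotated statements that carry this label."""
    return counts[label] / total


if __name__ == "__main__":
    for label, n in counts.items():
        print(f"{label:<17}{n:>7}  {share(label):.2%}")
    print(f"Errors overall: {error_total} of {total} ({error_total / total:.2%})")
```

Errors make up roughly 2% of all statements, so a detector evaluated on this data faces a heavily imbalanced label distribution.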

Error Severity Distribution

Each of the 265 errors (every statement not labeled No Error) was also classified by severity:

Low Severity: 138
High Severity: 87
Not Medically Relevant (NMR): 40
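
As a quick sanity check, the sketch below confirms that the severity counts add up to the number of error annotations (Hallucination, Inference, and Misunderstanding) and computes each severity level's share. All numbers are copied from this README; the variable names are illustrative only.

```python
# Counts copied from the two distributions in this README.
severity_counts = {"Low": 138, "High": 87, "Not Medically Relevant": 40}
error_counts = {"Hallucination": 106, "Inference": 87, "Misunderstanding": 72}

severity_total = sum(severity_counts.values())
error_total = sum(error_counts.values())

# Every error should carry exactly one severity label, so the totals must agree.
totals_match = severity_total == error_total

# Share of each severity level among all errors.
severity_share = {
    level: n / severity_total for level, n in severity_counts.items()
}

if __name__ == "__main__":
    print(f"severity total = {severity_total}, error total = {error_total}")
    for level, frac in severity_share.items():
        print(f"{level:<24}{frac:.1%}")
```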

High Severity Categories

Errors in the following categories are treated as high severity:

Diagnosis
Exam Findings
Lab Testing and Imaging
Medical History
Symptoms
Treatment Plan

Age & Sex errors are considered low severity.
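
The rule above can be sketched as a small lookup. This is an illustration, not code from the repository: the function name `severity_for`, the returned strings, and the case-insensitive matching are assumptions. The README does not say how the Not Medically Relevant level relates to these categories, so the sketch returns `None` for anything it does not recognize.

```python
from typing import Optional

# Categories listed in this README as high severity (stored lowercase).
HIGH_SEVERITY = frozenset({
    "diagnosis",
    "exam findings",
    "lab testing and imaging",
    "medical history",
    "symptoms",
    "treatment plan",
})

# The only category this README names as low severity.
LOW_SEVERITY = frozenset({"age & sex"})


def severity_for(category: str) -> Optional[str]:
    """Return "high" or "low" for a known category, else None.

    Hypothetical helper: matching is case-insensitive and ignores
    surrounding whitespace, which is an assumption of this sketch.
    """
    key = category.strip().lower()
    if key in HIGH_SEVERITY:
        return "high"
    if key in LOW_SEVERITY:
        return "low"
    return None


if __name__ == "__main__":
    for name in ("Diagnosis", "Age & Sex", "Billing"):
        print(name, "->", severity_for(name))
```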

Dataset Format

The released dataset contains:

Original ACI-bench conversation transcripts
Expert annotations of factual errors marked by category
Severity labels for each error
Aggregated error scores per subject
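
This README does not document the column layout of the released files, so a reasonable first step is to inspect the header before writing any analysis code. The sketch below uses only the Python standard library; `peek_csv` is a hypothetical helper, and the toy columns in the demo (`statement`, `label`) are placeholders, not the dataset's real column names.

```python
import csv
import io
from itertools import islice
from typing import Dict, List, TextIO, Tuple


def peek_csv(handle: TextIO, n_rows: int = 3) -> Tuple[List[str], List[Dict[str, str]]]:
    """Return the header fields and the first n_rows rows of a CSV file."""
    reader = csv.DictReader(handle)
    rows = list(islice(reader, n_rows))
    # fieldnames is None for an empty file, so fall back to an empty list.
    return list(reader.fieldnames or []), rows


if __name__ == "__main__":
    # Toy in-memory data with made-up columns, standing in for a real file.
    toy = io.StringIO("statement,label\nfoo,No Error\nbar,Hallucination\n")
    fields, rows = peek_csv(toy, n_rows=1)
    print(fields)
    print(rows)
```

To apply it to a released file, open the file with `open(path, newline="", encoding="utf-8")` and pass the handle to `peek_csv`; the UTF-8 encoding is an assumption here.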

Usage

The annotations can be used to:

Evaluate hallucination detection methods
Analyze different types of factual errors in clinical summarization
Study high vs low severity errors in medical text generation
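
As an illustration of the first use case, the sketch below scores a binary hallucination detector against expert labels using precision, recall, and F1. It is a minimal example under stated assumptions, not the evaluation protocol of the paper: the function name is hypothetical, the detector is assumed to emit one boolean flag per statement, and Hallucination, Inference, and Misunderstanding are all treated as the positive class.

```python
from typing import Sequence, Tuple

# Assumption: any label other than "No Error" counts as a positive.
ERROR_LABELS = frozenset({"Hallucination", "Inference", "Misunderstanding"})


def precision_recall_f1(
    gold_labels: Sequence[str], predicted_flags: Sequence[bool]
) -> Tuple[float, float, float]:
    """Score binary error predictions against expert category labels."""
    if len(gold_labels) != len(predicted_flags):
        raise ValueError("gold_labels and predicted_flags differ in length")
    tp = fp = fn = 0
    for label, flagged in zip(gold_labels, predicted_flags):
        is_error = label in ERROR_LABELS
        if flagged and is_error:
            tp += 1
        elif flagged:
            fp += 1
        elif is_error:
            fn += 1
    # Guard against empty denominators by reporting 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy labels and predictions, invented for this example.
    gold = ["No Error", "Hallucination", "Inference", "No Error", "Misunderstanding"]
    pred = [False, True, False, True, True]
    print(precision_recall_f1(gold, pred))
```

Because errors are rare in this dataset, precision and recall on the error class are more informative than plain accuracy, which a detector could maximize by never flagging anything.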

Citation

If you use this dataset, please cite:

Fact-Controlled Diagnosis of Hallucinations in Medical Text Summarization. BN, S., Shing, H.-C., Xu, L., Strong, M., Burnsky, J., Ofor, J., Mason, J. R., Chen, S., Srinivasan, S., Shivade, C., Moriarty, J., & Cohen, J. P. Interspeech 2025

@inproceedings{BN2024fact,
  title={Fact-Controlled Diagnosis of Hallucinations in Medical Text Summarization},
  author={BN, Suhas and Shing, Han-Chin and Xu, Lei and Strong, Mitch and Burnsky, Jon and Ofor, Jessica and Mason, Jordan R and Chen, Susan and Srinivasan, Sundararajan and Shivade, Chaitanya and Moriarty, Jack and Cohen, Joseph Paul},
  booktitle={Interspeech},
  year={2025},
  organization={ISCA}
}

Note

This release contains only the expert annotations on the ACI-bench summaries. The LLM outputs could not be made public because of licensing restrictions.
