This is the repository for the paper "Generative Large Language Models Trained for Detecting Errors in Radiology Reports".
Figure: the overall workflow of the large language models (LLMs) in this study.
Our work consists of three phases:
1. Dataset Construction
2. Model Development
3. Evaluation
We constructed a dataset consisting of two parts.
The first part includes 1,656 synthetic radiology reports generated by GPT-4 with specified prompts: 828 error-free synthetic reports and 828 synthetic reports with errors.
Please refer to Prompts_for_Synthetic.txt for these prompts.
The second part comprises 614 reports: 307 error-free reports from the MIMIC-CXR database, and 307 corresponding synthetic reports with errors, generated by GPT-4 from these MIMIC-CXR reports and specified prompts.
Please refer to Prompts_for_MIMIC.txt for these prompts.
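As a minimal illustration (not the exact pipeline used in the paper), the sketch below shows how an error-containing report could be generated from an error-free one with the OpenAI API. The prompt wording, sampling settings, and example report here are placeholders; the actual prompts are in the two files above.

```python
# Hedged sketch: ask GPT-4 to inject an error into a radiology report.
# Assumes the OpenAI Python SDK (`pip install openai`) and OPENAI_API_KEY set.
# The prompt text below is a placeholder; the real prompts are in
# Prompts_for_Synthetic.txt and Prompts_for_MIMIC.txt.
from openai import OpenAI

client = OpenAI()

def inject_error(report: str) -> str:
    """Rewrite a report so that it contains exactly one error."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {
                "role": "user",
                "content": (
                    "Rewrite the following radiology report so that it "
                    "contains exactly one clinically plausible error. "
                    "Return only the rewritten report.\n\n" + report
                ),
            },
        ],
        temperature=0.7,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    clean_report = "FINDINGS: The lungs are clear. No pleural effusion or pneumothorax."
    print(inject_error(clean_report))
```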
We fine-tuned our models with the [Firefly](https://github.com/yangjianxin1/Firefly) codebase.
Llama-3-8B-Instruct and Llama-3-70B-Instruct were fine-tuned on the training set with the following hyperparameters:
| Hyperparameter | Llama-3-8B-Instruct | Llama-3-70B-Instruct |
|---|---|---|
| Batch size | 1 | 1 |
| Learning rate | 3e-4 | 3e-4 |
| Epochs | 3 | 3 |
| Max length | 512 | 512 |
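Training itself is driven by Firefly's own scripts and configuration files; as a rough, hedged equivalent only, the sketch below reproduces the hyperparameters in the table with Hugging Face transformers + peft (LoRA). The dataset path, LoRA settings, and output directory are assumptions, not the paper's exact configuration.

```python
# Illustrative sketch only: the paper trains with the Firefly codebase.
# This shows roughly equivalent LoRA fine-tuning with transformers + peft,
# using the hyperparameters from the table above.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# LoRA adapter settings (assumed; Firefly sets these in its own config files).
model = get_peft_model(model, LoraConfig(r=64, lora_alpha=16, task_type="CAUSAL_LM"))

# "train.jsonl" with a "text" field is a placeholder for the actual training set.
dataset = load_dataset("json", data_files="train.jsonl", split="train")
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),  # Max length 512
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="output",
        per_device_train_batch_size=1,  # Batch size 1
        learning_rate=3e-4,             # Learning rate 3e-4
        num_train_epochs=3,             # Epochs 3
        bf16=True,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```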
We evaluated the performance of models including the fine-tuned Llama-3 models and GPT-4 on the test set.
Please refer to demo.ipynb for the evaluation code.
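For a self-contained flavor of the evaluation step (demo.ipynb is the authoritative code), here is a hedged inference sketch; the checkpoint path, instruction wording, and example report are assumptions.

```python
# Hedged inference sketch: ask a fine-tuned model whether a report has an error.
# "output" is a placeholder path to a fine-tuned checkpoint, not the released one.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "output"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16, device_map="auto"
)

report = "FINDINGS: The lungs are clear. IMPRESSION: Large right pleural effusion."
messages = [
    {
        "role": "user",
        "content": (
            "Does the following radiology report contain an error? "
            "Answer yes or no, and identify the error if present.\n\n" + report
        ),
    }
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(
    inputs, max_new_tokens=128, do_sample=False, pad_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```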
Please cite the following paper if you use the data or code in this repository:
```
@article{sun2025generative,
  title={Generative large language models trained for detecting errors in radiology reports},
  author={Sun, Cong and Teichman, Kurt and Zhou, Yiliang and Critelli, Brian and Nauheim, David and Keir, Graham and Wang, Xindi and Zhong, Judy and Flanders, Adam E and Shih, George and others},
  journal={Radiology},
  volume={315},
  number={2},
  pages={e242575},
  year={2025},
  publisher={Radiological Society of North America}
}
```