This project investigates and mitigates language bias in clinical notes, focusing on how machine learning models can detect and perpetuate implicit biases—especially racial and gender-related—in medical documentation. Inspired by recent works such as "Write It Like You See It" and "Hurtful Words", our approach combines the strengths of BERT-based language models for bias detection and GPT-based models for text debiasing.
This was developed as part of the NLP_AIML course project.
Machine learning models trained on clinical data often absorb and reinforce systemic biases. For example, even when race is redacted, models can infer it from implicit textual cues, leading to unfair clinical recommendations.
Our goal is twofold:
- Detect whether a piece of clinical text contains latent bias using transformer-based models.
- Debias these texts to ensure fairer downstream predictions using autoregressive LLMs (GPT-style).
Tech stack:

- Python 3.9+
- Transformers (HuggingFace)
- PyTorch
- BERT (ClinicalBERT, SciBERT)
- GPT-3.5 / GPT-4 API (for generative debiasing)
- scikit-learn (for evaluation and metrics)
- spaCy / NLTK (for preprocessing)
Bias detection:

- Clinical notes sourced from MIMIC-III and Columbia University Medical Center datasets.
- Notes redacted for explicit race indicators (e.g., "Black", "White", "Caucasian").
- Classifiers ranging from logistic regression and XGBoost to fine-tuned ClinicalBERT and SciBERT predict whether a note implies a specific racial identity (a scoring sketch follows this list).
- These models achieved high AUC even on redacted notes, showing that they pick up implicit cues rather than explicit labels.
- A panel of physicians given the same task performed no better than chance, underscoring how non-obvious these cues are.
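A minimal sketch of the scoring step, assuming a fine-tuned ClinicalBERT checkpoint saved under a hypothetical `models/clinicalbert-race` directory and the HuggingFace `transformers` API; the repository's actual interface may differ:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical path to a ClinicalBERT classifier fine-tuned on redacted notes;
# adjust to wherever the fine-tuned checkpoint is saved.
MODEL_DIR = "models/clinicalbert-race"

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_DIR)
model.eval()

def bias_score(note: str) -> float:
    """Probability that a redacted note still implies a specific racial identity."""
    inputs = tokenizer(note, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

print(bias_score("54-year-old patient with poorly controlled diabetes, family at bedside."))
```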
Debiasing:

- Prompts crafted for GPT-based models to rewrite biased clinical notes while preserving clinical meaning; a prompting sketch follows this list.
- Target: reduce associations between demographic proxies and clinical descriptions.
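A hedged sketch of this prompting step, using the official `openai` Python client; the system prompt and model name below are illustrative choices, not necessarily what `debiasing/gpt_rewriter.py` uses:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative instruction; the project's actual prompt wording may differ.
SYSTEM_PROMPT = (
    "You rewrite clinical notes. Remove wording that acts as a proxy for race, "
    "gender, or socioeconomic status, while preserving every clinical finding, "
    "medication, and plan."
)

def rewrite_note(note: str, model: str = "gpt-4") -> str:
    """Ask a GPT model for a debiased rewrite of one clinical note."""
    response = client.chat.completions.create(
        model=model,
        temperature=0.0,  # deterministic rewrites make manual review easier
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": note},
        ],
    )
    return response.choices[0].message.content
```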
Results:

- Detection accuracy: the ClinicalBERT ensemble reached an AUC of ~0.83.
- Bias examples:
  - Words like "bruising", "paleness", or "family support" skew predictions along racial lines.
- Post-debiasing evaluation:
  - Reduced racial and gender associations.
  - Core clinical content maintained, as verified by cosine similarity (see the sketch below) and manual checks.
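The content-preservation check mentioned above can be approximated with scikit-learn alone; a sketch comparing an original note with its rewrite via TF-IDF cosine similarity (the project may also use embedding-based similarity, so treat this as illustrative):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def content_similarity(original: str, rewritten: str) -> float:
    """Cosine similarity between TF-IDF vectors of the original and debiased note."""
    vectors = TfidfVectorizer().fit_transform([original, rewritten])
    return float(cosine_similarity(vectors[0], vectors[1])[0, 0])

# Low-similarity rewrites can be routed to manual review; any cutoff value is a
# project-specific choice, not one prescribed here.
```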
Project structure:

```
bias-clinical-nlp/
│
├── data/                       # Raw and redacted clinical notes
├── models/                     # Fine-tuned BERT and XGBoost classifiers
├── debiasing/
│   └── gpt_rewriter.py         # Prompts GPT to generate unbiased text
├── notebooks/
│   ├── bias_detection.ipynb
│   └── debiasing_pipeline.ipynb
├── utils/
│   └── preprocessing.py        # Tokenization, redaction, filtering
│
├── requirements.txt
└── README.md
```
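`utils/preprocessing.py` covers tokenization, redaction, and filtering. A minimal sketch of the redaction step, using an illustrative keyword list (the project's actual term list is assumed to be more comprehensive):

```python
import re

# Illustrative subset of explicit race/ethnicity terms removed before training.
RACE_TERMS = ["black", "white", "caucasian", "african american", "hispanic", "asian"]
RACE_PATTERN = re.compile(r"\b(?:" + "|".join(map(re.escape, RACE_TERMS)) + r")\b", re.IGNORECASE)

def redact_race(note: str, placeholder: str = "[REDACTED]") -> str:
    """Replace explicit race indicators so classifiers only see implicit cues."""
    return RACE_PATTERN.sub(placeholder, note)

print(redact_race("Pleasant Caucasian male in no acute distress."))
# -> "Pleasant [REDACTED] male in no acute distress."
```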
Quick start:

- Install dependencies:
  pip install -r requirements.txt
- Run the detection pipeline (Jupyter notebook):
  jupyter notebook notebooks/bias_detection.ipynb
- Run GPT-based debiasing:
  python debiasing/gpt_rewriter.py --input data/redacted_notes.txt
- Evaluate the debiased output (see the evaluation sketch below):
  jupyter notebook notebooks/debiasing_pipeline.ipynb
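One hedged way to quantify the effect of debiasing is to re-score the rewritten notes with the detection classifier and compare ROC AUC before and after; a scikit-learn sketch with placeholder values standing in for real labels and model scores:

```python
from sklearn.metrics import roc_auc_score

# Placeholder values for illustration only; in practice the labels come from a
# held-out split and the scores from the detection classifier.
labels        = [1, 0, 1, 0, 1, 0]                      # self-reported race, binarized
scores_before = [0.91, 0.22, 0.78, 0.35, 0.80, 0.40]    # detector scores on original notes
scores_after  = [0.55, 0.48, 0.60, 0.41, 0.52, 0.47]    # detector scores on debiased rewrites

print("AUC before debiasing:", roc_auc_score(labels, scores_before))
print("AUC after debiasing: ", roc_auc_score(labels, scores_after))
```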
References:

- Adam, H. et al. "Write It Like You See It: Detectable Differences in Clinical Notes by Race Lead to Differential Model Recommendations." AIES 2022.
- Zhang, H. et al. "Hurtful Words: Quantifying Biases in Clinical Contextual Word Embeddings." CHIL 2020.
Ethics:

- All patient data was de-identified.
- GPT outputs were carefully reviewed to avoid clinical misinterpretation.
- This work does not replace clinical judgment and is meant to highlight algorithmic bias, not to automate care.
Future work:

- Explore RLHF for better control in text debiasing.
- Extend bias detection beyond race (e.g., gender, insurance status, language).
- Integrate into clinical decision support tools with real-time debiasing layers.
MIT License © 2025 Your Name