A simple tool for analyzing how well language models handle "None of the other Answers" (NOTA) options in medical question answering, especially under Chain-of-Thought (CoT) reasoning.
This project investigates whether large language models (LLMs) such as GPT, Claude, and DeepSeek-R1 can reliably identify when none of the answer choices in a medical multiple-choice question is correct. It compares model performance on the same questions with and without the requirement to recognize a NOTA option.
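For illustration, the NOTA condition replaces the correct choice with a "None of the other answers" option, so the model can only score the point by recognizing that every listed answer is wrong. The item below is made up for this README, not drawn from the evaluation data:

```python
# Made-up example of the two conditions; actual dataset items differ.
standard = {
    "question": "Which electrolyte disturbance most commonly causes peaked T waves?",
    "options": {"A": "Hyperkalemia", "B": "Hyponatremia",
                "C": "Hypocalcemia", "D": "Hypomagnesemia"},
    "answer": "A",
}
nota = {
    "question": standard["question"],
    # The correct choice is removed, so NOTA becomes the right answer.
    "options": {"A": "Hyponatremia", "B": "Hypocalcemia",
                "C": "Hypomagnesemia", "D": "None of the other answers"},
    "answer": "D",
}
```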
```bash
conda env create -f environment.yaml
conda activate cot-eval
```
Before running any experiments, add your API key to the config file at `scripts/config.py`.
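A minimal sketch of what that file might contain; the variable names below are assumptions for illustration, not the repository's actual ones:

```python
# scripts/config.py -- hypothetical layout; match the names the scripts expect.
import os

# Read keys from the environment rather than hard-coding secrets.
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "")
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY", "")
```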
Then register the model endpoints in `scripts/src/medqa_nato.py`.
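One plausible shape for that registration, assuming a simple name-to-URL mapping; the dictionary name and entries here are guesses, not the file's actual contents:

```python
# Hypothetical endpoint table; the real file may organize this differently.
MODEL_ENDPOINTS = {
    "gpt-4o": "https://api.openai.com/v1",
    "deepseek-r1": "https://api.deepseek.com/v1",
}
```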
```bash
cd scripts/data
python3 load_data.py            # download and prepare the dataset
cd ../src
python3 medqa_nato.py           # run the CoT / NOTA experiments
python3 nota_accuracy_stats.py  # compute the accuracy statistics
```
`nota_accuracy_stats.py` reports:

- Accuracy comparisons between the regular CoT and NOTA conditions
- Confidence intervals for model performance
- P-values for statistical significance testing (see the sketch after this list)
- Question-level insights: which questions showed the largest drops in accuracy
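A minimal sketch of how such statistics can be computed, assuming simple normal-approximation methods; the function names and example counts are illustrative, not taken from the script:

```python
# Illustrative statistics; nota_accuracy_stats.py may compute these differently.
from math import sqrt
from statistics import NormalDist

def proportion_ci(correct: int, total: int, level: float = 0.95):
    """Normal-approximation (Wald) confidence interval for an accuracy."""
    p = correct / total
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * sqrt(p * (1 - p) / total)
    return max(0.0, p - half), min(1.0, p + half)

def two_proportion_z_test(c1: int, n1: int, c2: int, n2: int) -> float:
    """Two-sided p-value for the difference between two accuracies."""
    pooled = (c1 + c2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (c1 / n1 - c2 / n2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical counts: 412/500 correct under regular CoT vs. 318/500 under NOTA.
print(proportion_ci(412, 500))                    # CI for the CoT condition
print(two_proportion_z_test(412, 500, 318, 500))  # p-value for the gap
```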