LLM Evaluations for MCQs (Cancer)

This repository evaluates LLM responses to multiple-choice questions (MCQs) on cancer-related topics.

Setup

Create a new virtual environment and activate it

    python3 -m venv <env_name>
    source <env_name>/bin/activate

Install the dependencies

    make setup

Set environment variables

Refer to the .env.example file and create a .env file with the required environment variables, then export them into the current shell

    set -a
    source .env
    set +a
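Before starting the service, it can help to confirm that every variable named in .env.example is actually set. The following is a minimal sketch, not part of the repository: the helper names `parse_env_keys` and `missing_variables` are hypothetical, and it assumes .env.example uses plain KEY=VALUE lines.

```python
import os
from pathlib import Path


def parse_env_keys(text: str) -> list[str]:
    """Return variable names from KEY=VALUE lines, skipping blanks and comments."""
    keys = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Tolerate an optional leading "export " as used in some .env files.
        if line.startswith("export "):
            line = line[len("export "):]
        keys.append(line.split("=", 1)[0].strip())
    return keys


def missing_variables(example_path: str = ".env.example") -> list[str]:
    """List names declared in the example file that are absent from the environment."""
    text = Path(example_path).read_text(encoding="utf-8")
    return [key for key in parse_env_keys(text) if key not in os.environ]
```

Calling `missing_variables()` from the repository root, in the same shell where .env was sourced, returns an empty list when nothing is missing.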

Format the code and start the service

    make format

    make start

To test the functionality for one query

Start the server with make start
Navigate to http://127.0.0.1:8000/docs
Click Try it out under the /evaluate_query endpoint

Synthetic Response Payload

{
  "query": "Which radionuclide was first used to noninvasively assess left ventricular ejection fraction and regional wall motion?",
  "options": "A. 99mTc-sestamibi B. Thallium-201 (201Tl) C. Potassium-43 (43K) D. 99mTc-labeled human serum albumin E. Rubidium-82 (82Rb) F. 13N-ammonia G. 18F-FDG H. 15O-water",
  "answer": "D. 99mTc-labeled human serum albumin",
  "question_format": "synthetic",
  "long_context": {
    "file_type": "pdf",
    "link_or_text": "data/Dataset_Eval/PubMed_Pdfs/1.pdf"
  }
}
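In the example payload, "options" is a single string of lettered choices and "answer" repeats one of them in full. The sketch below is a client-side consistency check only, and makes no claim about how the service itself parses these fields. The function names are hypothetical, and the split assumes labels are single capital letters followed by a period and a space, so option text that itself contains such a pattern would be split incorrectly.

```python
import re


def split_options(options: str) -> dict[str, str]:
    """Split 'A. foo B. bar' into {'A': 'foo', 'B': 'bar'}."""
    parts = re.split(r"(?:^|\s)([A-Z])\.\s+", options.strip())
    # parts[0] is the text before the first label (normally empty);
    # after that, labels and option texts alternate.
    labels, texts = parts[1::2], parts[2::2]
    return {label: text.strip() for label, text in zip(labels, texts)}


def answer_matches(options: str, answer: str) -> bool:
    """Check that an answer like 'D. text' matches the option with that label."""
    label, _, text = answer.partition(". ")
    return split_options(options).get(label) == text.strip()
```

Running `answer_matches` on the "options" and "answer" values before submitting a payload catches mismatched labels early.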

Rephrase Response Payload

{
  "query": "Which radionuclide was first used to noninvasively assess left ventricular ejection fraction and regional wall motion?",
  "options": "A. 99mTc-sestamibi B. Thallium-201 (201Tl) C. Potassium-43 (43K) D. 99mTc-labeled human serum albumin E. Rubidium-82 (82Rb) F. 13N-ammonia G. 18F-FDG H. 15O-water",
  "answer": "D. 99mTc-labeled human serum albumin",
  "question_format": "rephrase",
  "long_context": {
    "file_type": "pdf",
    "link_or_text": "data/Dataset_Eval/PubMed_Pdfs/1.pdf"
  }
}

Raw Response Payload

{
  "query": "Which radionuclide was first used to noninvasively assess left ventricular ejection fraction and regional wall motion?",
  "options": "A. 99mTc-sestamibi B. Thallium-201 (201Tl) C. Potassium-43 (43K) D. 99mTc-labeled human serum albumin E. Rubidium-82 (82Rb) F. 13N-ammonia G. 18F-FDG H. 15O-water",
  "answer": "D. 99mTc-labeled human serum albumin",
  "question_format": "raw",
  "long_context": {
    "file_type": "pdf",
    "link_or_text": "data/Dataset_Eval/PubMed_Pdfs/1.pdf"
  }
}
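The three payloads above differ only in "question_format". As a rough sketch, they could be generated and submitted from Python instead of the /docs page. This assumes the JSON objects are what /evaluate_query accepts as its request body and that the endpoint takes a POST; neither is stated here, so confirm both in /docs. `build_payload` and `evaluate` are hypothetical helper names, not functions from the repository.

```python
import json
import urllib.request

# Values taken from the three example payloads in this README.
QUESTION_FORMATS = ("synthetic", "rephrase", "raw")

# Assumption: the endpoint accepts POST with a JSON body; confirm in /docs.
EVALUATE_URL = "http://127.0.0.1:8000/evaluate_query"


def build_payload(query, options, answer, question_format, pdf_path):
    """Assemble a body shaped like the example payloads."""
    if question_format not in QUESTION_FORMATS:
        raise ValueError(f"unknown question_format: {question_format!r}")
    return {
        "query": query,
        "options": options,
        "answer": answer,
        "question_format": question_format,
        "long_context": {"file_type": "pdf", "link_or_text": pdf_path},
    }


def evaluate(payload, url=EVALUATE_URL):
    """Send one payload to the running service and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server started via make start, looping over `QUESTION_FORMATS` and passing each result of `build_payload` to `evaluate` would exercise all three formats for one question.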

Improvements

Incorporate LiteLLM framework
Use TogetherAI for OpenLLM calls
Simplify the codebase

SathvikNapa/MCQEvaluations
