A language model for medical applications, built in three stages: pretraining on medical text, instruction fine-tuning, and Direct Preference Optimization (DPO).
Datasets
- Pretraining: Medical Text Dataset (Kaggle)
- Fine-tuning: PMC LLaMA Instructions (Hugging Face)
Pretraining
- Pretrained a custom GPT model on a corpus of medical texts
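As a rough illustration of what pretraining data preparation for a causal (GPT-style) model involves, the sketch below slides a fixed-size context window over a token stream to produce (context, next-token) training pairs. The whitespace tokenizer, window size, and function name are illustrative assumptions, not the project's actual code.

```python
# Hypothetical sketch of causal-LM training-pair construction for pretraining.
# A whitespace tokenizer stands in for whatever tokenizer the project uses.

def make_lm_pairs(tokens, context_len):
    """Slide a fixed-size window over the token stream: each window of
    `context_len` tokens is an input, and the token right after it is
    the next-token prediction target."""
    pairs = []
    for i in range(len(tokens) - context_len):
        context = tokens[i : i + context_len]
        target = tokens[i + context_len]
        pairs.append((context, target))
    return pairs

tokens = "patients with diabetes require regular glucose monitoring".split()
pairs = make_lm_pairs(tokens, context_len=3)
print(pairs[0])  # (['patients', 'with', 'diabetes'], 'require')
```

During pretraining the model is optimized with cross-entropy loss to predict each target token from its context.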
Instruction Fine-tuning
- Fine-tuned with LitGPT using LoRA on the instruction dataset
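To show the idea behind LoRA used in this step: the base weight matrix W stays frozen, and only a low-rank pair of matrices (A, B) is trained; their scaled product is added to W at apply time. The plain-Python matrices below are a minimal sketch for clarity; LitGPT applies this inside attention/MLP layers via PyTorch, and the shapes and scaling here are assumptions.

```python
# Minimal sketch of the LoRA update: W_eff = W + (alpha / r) * (B @ A).
# Only A and B are trained; W is frozen. Plain-Python matrices for clarity.

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_effective_weight(W, A, B, alpha, r):
    """Combine a frozen weight W with a rank-r adapter (B @ A)."""
    delta = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(wrow, drow)]
            for wrow, drow in zip(W, delta)]

# 2x2 frozen weight with a rank-1 adapter (B: 2x1, A: 1x2)
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.5]]
print(lora_effective_weight(W, A, B, alpha=2, r=1))  # [[2.0, 1.0], [2.0, 3.0]]
```

Because only A and B are updated, the number of trainable parameters is a small fraction of the full model, which is what makes fine-tuning feasible on modest hardware.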
Direct Preference Optimization (DPO)
- Generated answer variants using the fine-tuned model
- Created preference pairs based on Levenshtein distance
- Customized the preference data for the medical domain
- Progressed from a general language model to an instruction-following one
- Experimented with preference optimization via DPO
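The preference-pair construction described above can be sketched as follows: among the generated variants, rank by edit distance to a reference answer, taking the closest variant as "chosen" and the farthest as "rejected". The ranking rule, function names, and record fields here are illustrative assumptions; only the use of Levenshtein distance comes from the project description.

```python
# Hedged sketch of building DPO preference pairs via Levenshtein distance.
# Assumption: the variant closest to the reference is "chosen", the
# farthest is "rejected" (the prompt/chosen/rejected layout matches the
# common DPO dataset convention).

def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def make_preference_pair(prompt, reference, variants):
    ranked = sorted(variants, key=lambda v: levenshtein(v, reference))
    return {"prompt": prompt, "chosen": ranked[0], "rejected": ranked[-1]}

pair = make_preference_pair(
    "What does hypertension mean?",
    "Hypertension means high blood pressure.",
    ["Hypertension means high blood pressure!",
     "It is a heart thing.",
     "Hypertension refers to high blood pressure."],
)
print(pair["chosen"])   # "Hypertension means high blood pressure!"
print(pair["rejected"])  # "It is a heart thing."
```

Edit distance is a cheap proxy for answer quality; it rewards surface similarity to the reference rather than factual correctness, which is a known limitation of this pairing scheme.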
Future Work
- Larger medical datasets
- Advanced DPO techniques
- Multi-task learning in the medical domain
- Benchmark evaluation:
  - Compare against established medical NLP models
  - Evaluate on standardized medical QA datasets
  - Assess performance on clinical decision support tasks
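For the planned QA benchmarking, a simple starting metric is exact-match accuracy over the model's answers (e.g., answer letters on multiple-choice medical QA sets). The dataset format and normalization rule below are assumptions for illustration, not part of the project.

```python
# Hypothetical sketch of exact-match scoring for multiple-choice QA.
# Normalization (strip + lowercase) is an assumed convention.

def exact_match_accuracy(predictions, references):
    """Fraction of predictions matching the gold answer after normalization."""
    norm = lambda s: s.strip().lower()
    hits = sum(norm(p) == norm(r) for p, r in zip(predictions, references))
    return hits / len(references)

preds = ["B", "c ", "A"]
golds = ["B", "C", "D"]
print(exact_match_accuracy(preds, golds))  # 2 of 3 correct
```

Free-form clinical answers would need softer metrics (token overlap, or clinician review) since exact match penalizes valid paraphrases.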