This project fine-tunes the LLaMA 2 7B model using LoRA to build a simple doctor chatbot. The model is trained on instruction-style medical questions (MedAlpaca / MedQA-style) to respond to health-related prompts in a helpful and conversational way.
All training and testing were done in Google Colab using 4-bit quantized weights to keep memory usage low.
## How It Works
- Loads the Meta LLaMA 2 7B model with 4-bit quantization using bitsandbytes
- Applies LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning
- Tokenizes and formats medical questions into instruction-response prompts
- Fine-tunes the model on 1,000 examples from the medalpaca/medical_meadow_medqa dataset
- Trains using Hugging Face Transformers + PEFT (see the sketches after this list)
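The snippet below is a minimal sketch of the load-and-adapt step from the list above: 4-bit quantization via bitsandbytes, then LoRA on top. The LoRA rank, alpha, and target modules are illustrative assumptions and may not match the exact values in src/train.py.

```python
# Sketch of 4-bit loading + LoRA setup; hyperparameters are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-7b-hf"  # gated model; requires HF access approval

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # store weights in 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # compute in fp16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA ships without a pad token

model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,                                  # low-rank dimension (assumed)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections (assumed)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()         # only LoRA params are trainable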
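And a matching sketch for the formatting and training steps, continuing from the `tokenizer` and `model` above. The Alpaca-style template and the Trainer arguments are assumptions; a batch size of 4 over 1,000 examples for 3 epochs happens to give the 750 optimizer steps mentioned under Training Status, but the real script may use different values.

```python
# Sketch of prompt formatting + training; template and args are assumptions.
from datasets import load_dataset
from transformers import DataCollatorForLanguageModeling, Trainer, TrainingArguments

def format_example(example):
    # medical_meadow_medqa rows follow the Alpaca instruction/input/output
    # schema; fold them into a single instruction-response prompt.
    prompt = (
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Input:\n{example['input']}\n\n"
        f"### Response:\n{example['output']}"
    )
    return tokenizer(prompt, truncation=True, max_length=512)

dataset = load_dataset("medalpaca/medical_meadow_medqa", split="train[:1000]")
tokenized = dataset.map(format_example, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,  # the LoRA-wrapped model from the previous sketch
    args=TrainingArguments(
        output_dir="results",
        per_device_train_batch_size=4,  # 1,000 examples * 3 epochs / 4 = 750 steps
        num_train_epochs=3,
        fp16=True,
        logging_steps=50,
        save_strategy="epoch",
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("doctor-lora-adapter")  # writes only the LoRA weights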
## Training Status
Training ran successfully up to step 740 out of 750. The process was stopped because Colab GPU quota was exhausted.
The model was saved and is inference-ready, but final evaluation was not completed.
## Running the Code
Install dependencies:

```bash
pip install -r requirements.txt
```
To run inference:

```bash
python src/inference.py
```
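For reference, inference with the saved adapter likely looks something like the sketch below: reload the 4-bit base model, attach the LoRA weights, and generate. The adapter path ("doctor-lora-adapter"), the prompt template, and the generation settings are illustrative assumptions, not a copy of src/inference.py.

```python
# Sketch of inference with the saved adapter; paths/settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(model, "doctor-lora-adapter")  # attach LoRA weights
model.eval()

prompt = "### Instruction:\nWhat are common symptoms of anemia?\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(
        **inputs, max_new_tokens=200, do_sample=True, temperature=0.7
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```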
To resume or reproduce training:

```bash
python src/train.py
```
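Since the original run stopped at step 740 of 750, resuming from the last checkpoint may be possible if src/train.py writes Trainer checkpoints to its output directory; this is a standard Trainer feature, not something confirmed in the script:

```python
# Resume from the most recent checkpoint in output_dir, if one exists.
trainer.train(resume_from_checkpoint=True)
```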
## Project Structure
```
doctor-chatbot/
├── src/              # Training and inference scripts
├── results/          # Placeholder for evaluation results or notes
├── requirements.txt  # Python dependencies
└── README.md         # Project documentation
```
## Note
This project was implemented and tested manually in Google Colab. The full training run could not be completed due to GPU limits, but the setup is available for further fine-tuning or evaluation.
## Author
Built with ❤️ by Ritesh