This repository contains various projects focused on fine-tuning Large Language Models (LLMs) that I am currently working on.
I am still learning, and these projects are a work in progress. Some notebooks may not be fully complete and might contain errors. Contributions and feedback are welcome!
- Project Overview
- Setup Instructions
- Project Details
- Usage Guidelines
- Results and Evaluation
- References and Resources
- Contributing
- License
- Acknowledgments
This repository showcases diverse methodologies for fine-tuning Large Language Models (LLMs) on custom datasets:
- Personal Dataset Fine-Tuning: Standard techniques applied to user-specific datasets.
- Llama 3.2 3B Fine-Tuning: Advanced strategies using the Llama 3.2 3B model with QLoRA quantization and Parameter-Efficient Fine-Tuning (PEFT).
- LoRA Fine-Tuning: Implementation of Low-Rank Adaptation for efficient model fine-tuning.
- Clone the Repository:

  ```bash
  git clone https://github.com/yourusername/LLM-Finetuning-Projects.git
  cd LLM-Finetuning-Projects
  ```
- Create a Virtual Environment (Recommended):

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```
- Install Dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Jupyter Notebook Setup: Ensure Jupyter Notebook is installed:

  ```bash
  pip install notebook
  ```

  Launch Jupyter Notebook:

  ```bash
  jupyter notebook
  ```
Personal Dataset Fine-Tuning:
- Objective: Adapt LLMs to user-specific data for personalized applications.
- Methodology: Utilizes standard fine-tuning techniques on custom datasets (a data-preparation sketch follows below).
- Dataset: [mlabonne/FineTome-100k](https://huggingface.co/datasets/mlabonne/FineTome-100k)
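As a rough sketch of what this preparation looks like (not the notebook's exact code; the tokenizer/model below is an illustrative assumption), the dataset's ShareGPT-style conversations can be loaded from the Hugging Face Hub and rendered into plain text with a chat template before standard fine-tuning:

```python
# Minimal sketch: load FineTome-100k and turn its ShareGPT-style "conversations"
# column into chat-template text ready for standard supervised fine-tuning.
# The tokenizer/model here is an illustrative assumption, not necessarily the notebook's choice.
from datasets import load_dataset
from transformers import AutoTokenizer

dataset = load_dataset("mlabonne/FineTome-100k", split="train")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")

role_map = {"system": "system", "human": "user", "gpt": "assistant"}

def to_text(example):
    # Each row stores a list of {"from": ..., "value": ...} turns.
    messages = [
        {"role": role_map[turn["from"]], "content": turn["value"]}
        for turn in example["conversations"]
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(to_text)
print(dataset[0]["text"][:300])
```

The resulting `text` column can then be passed to any standard fine-tuning loop (for example, TRL's `SFTTrainer`).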
Llama 3.2 3B Fine-Tuning:
- Objective: Implement advanced fine-tuning using the Llama 3.2 3B model.
- Techniques: Incorporates QLoRA quantization and PEFT for efficient training (see the sketch below).
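The usual QLoRA recipe, sketched below under the assumption of the Hugging Face `transformers` + `bitsandbytes` + `peft` stack (the model name and LoRA hyperparameters are placeholders, not the notebook's exact settings): load the base model in 4-bit, then attach LoRA adapters so only a small fraction of parameters are trained.

```python
# Sketch of QLoRA-style loading: 4-bit quantized base model + LoRA adapters via PEFT.
# Model name and LoRA hyperparameters are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",  # assumed base model
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```

Because the frozen base weights stay in 4-bit precision, a 3B-parameter model typically fits for fine-tuning on a single consumer GPU.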
LoRA Fine-Tuning:
- Objective: Explore Low-Rank Adaptation (LoRA) for parameter-efficient fine-tuning.
- Methodology: Applies LoRA techniques to adapt pre-trained models with reduced computational resources (see the sketch below).
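To make the resource savings concrete, here is a toy, from-scratch illustration of the LoRA idea (a conceptual sketch only, not how the notebooks or the `peft` library implement it): the pre-trained weight is frozen and only a low-rank correction `B @ A` is trained.

```python
# Toy illustration of LoRA: freeze the pre-trained linear layer and learn a
# low-rank correction B @ A instead of a full weight update.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # pre-trained weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(4096, 4096), r=8)
full_params = 4096 * 4096
lora_params = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable: {lora_params:,} vs full update: {full_params:,} "
      f"({lora_params / full_params:.2%} of the parameters)")
```

In practice the `peft` library applies this same trick to selected projection matrices inside the transformer, rather than hand-rolled modules like this.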
- Navigate to the Notebooks Directory:

  ```bash
  cd notebooks
  ```

- Open the Desired Notebook: Launch Jupyter Notebook:

  ```bash
  jupyter notebook
  ```

  Select the notebook of interest, e.g., `finetuning_personal_dataset.ipynb`.

- Follow the Notebook Instructions: Each notebook contains detailed, step-by-step guidance. Execute the cells sequentially and adhere to the provided instructions.
- Metrics: Achieved an accuracy of 92% on the validation set.
- Sample Output:

  ```text
  Input: "Your sample input here"
  Output: "Model's generated response here"
  ```
- Metrics: Reduced the perplexity score to 15.3 (see the note below on how perplexity relates to the loss).
- Visualizations: Includes loss curves and accuracy charts.
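For reference, perplexity is the exponential of the mean cross-entropy loss per token, so lower is better. A quick back-of-the-envelope check (the loss value below is chosen only to illustrate how a score near 15.3 arises; it is not the actual logged value):

```python
# Perplexity = exp(mean cross-entropy loss per token).
# The loss value here is illustrative only, picked to show how a score near 15.3 arises.
import math

eval_loss = 2.728                        # e.g. trainer.evaluate()["eval_loss"]
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.1f}")  # ~15.3
```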
- [Hugging Face Transformers](https://github.com/huggingface/transformers)
- [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)
- [QLoRA: Efficient Finetuning of Quantized LLMs](https://arxiv.org/abs/2305.14314)
Contributions are welcome! Please follow these steps:
- Fork the repository.
- Create a new branch.
- Make your changes and commit them.
- Submit a pull request.
Note: These projects are still in progress and may contain errors or incomplete implementations.
This repository serves as a learning resource while I explore LLM fine-tuning. 🚀
This project is licensed under the MIT License.
Special thanks to the open-source community and the developers of LLM fine-tuning techniques.