# 🎋🌿 **Production-Ready Instruction Fine-Tuning of Meta LLaMA 3.2 3B Instruct Project** 🌿🎉

## **Problem Statement**
---

### **My Role as a Developer** 🎋

As a developer, I am responsible for delivering an instruction fine-tuned **LLaMA 3.2 3B** model that aligns with the defined **Key Performance Indicator (KPI)** objectives and ensures exceptional performance for Kannada-speaking users.

- I will **instruct fine-tune** the model using the high-quality **Kannada dataset** from **Hugging Face** (`charanhu/kannada-instruct-dataset-390k`).

- To address the constraints of **limited GPU resources**, I will implement **QLoRA-based 4-bit precision quantization** using **Unsloth**, which involves:
  - First **quantizing the model** to 4-bit precision to reduce computational overhead.
  - Adding **LoRA (Low-Rank Adaptation) layers** to fine-tune the model efficiently within **Google Colab**, ensuring optimal resource utilization without compromising performance.

*Note: The fine-tuning code is fully modular, but I used **Google Colab** for training. If you have a high-end machine, make sure you execute the **pipeline** in a modular fashion.*

## Fine-tuning Pipeline 💥
---
### Installing the required libraries
* Unsloth can be tricky to install, so execute these code cells one by one, in sequence, to avoid problems.

```bash
# Run this first (cell 1)
!python -m pip install --upgrade pip
!pip install --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
!pip install xformers[torch2]  # Install xformers built for PyTorch 2.x
!pip install "unsloth[colab] @ git+https://github.com/unslothai/unsloth.git"
!pip install "git+https://github.com/huggingface/transformers.git"
!pip install trl
!pip install boto3
```

```bash
# Run this cell (cell 2)
!pip install --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118  # Upgrade PyTorch to a compatible version
!pip install xformers  # Install xformers after upgrading PyTorch
```

```bash
# cell 3
!pip uninstall torch torchvision torchaudio -y  # Uninstall existing PyTorch, torchvision, and torchaudio
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118  # Install PyTorch, torchvision, and torchaudio with CUDA 11.8
```

```bash
# cell 4
!pip uninstall xformers -y
!pip install xformers[torch2]  # Install xformers built for PyTorch 2.x
```

### Importing Necessary Libraries

<img width="656" alt="Importing Necessary Libraries" src="https://github.com/user-attachments/assets/dfb4fdee-0513-4202-b5d1-167e15689354">
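
The cell above is a screenshot, so here is a rough text equivalent of the imports, inferred from the steps that follow rather than copied from the notebook:

```python
# Inferred imports for this pipeline; adjust to match your notebook.
import torch
from unsloth import FastLanguageModel   # 4-bit loading + LoRA helpers
from datasets import load_dataset       # pulls the Kannada dataset
from trl import SFTTrainer              # supervised fine-tuning loop
from transformers import TrainingArguments
import boto3                             # S3 upload at the end of the pipeline
```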

### Loading the Model

<img width="640" alt="Loading  Model" src="https://github.com/user-attachments/assets/89013450-1bb1-4a29-9ad4-2a620004064e">
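
A minimal sketch of this step with Unsloth; the checkpoint name and context length below are assumptions, not values read from the screenshot:

```python
from unsloth import FastLanguageModel

max_seq_length = 2048  # assumed context length

# Load the base model in 4-bit precision (QLoRA-style) via Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct",  # assumed checkpoint name
    max_seq_length=max_seq_length,
    dtype=None,          # auto-detect float16 / bfloat16
    load_in_4bit=True,   # 4-bit quantization to fit Colab GPUs
)
```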

### Applying LoRA layers

<img width="620" alt="Applying  Lora" src="https://github.com/user-attachments/assets/062a2115-d24d-4ede-9c83-2fc9665cdaa1">
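
The LoRA step would look roughly like this; the rank, alpha, and target modules are common Unsloth defaults rather than confirmed values from the screenshot:

```python
# Wrap the quantized base model with trainable LoRA adapters.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                         # LoRA rank (assumed)
    lora_alpha=16,
    lora_dropout=0,
    target_modules=[              # attention + MLP projections
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",
    use_gradient_checkpointing="unsloth",  # memory-efficient backprop
    random_state=3407,
)
```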

### Data Preparation

<img width="920" alt="Dataset Preparation" src="https://github.com/user-attachments/assets/869f6569-df05-455f-bd7e-ba71dc036593">
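
Loading the dataset itself is a one-liner; the split and column names are assumptions worth checking against the dataset card:

```python
from datasets import load_dataset

# Pull the 390k-example Kannada instruction dataset from the Hugging Face Hub.
dataset = load_dataset("charanhu/kannada-instruct-dataset-390k", split="train")
print(dataset)  # inspect the columns (e.g. instruction / input / output)
```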

### Data Formatting (what the model expects for instruction tuning)

<img width="920" alt="Prompt Formatting" src="https://github.com/user-attachments/assets/58f7c5cf-945a-43d7-a9cf-670eee3261e6">
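
As a sketch, an Alpaca-style template like the one below turns each row into a single training string; the `instruction` / `input` / `output` field names are assumptions about the dataset schema:

```python
# Assumed Alpaca-style template; the screenshot may use different wording.
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

EOS_TOKEN = tokenizer.eos_token  # appended so the model learns when to stop

def formatting_prompts_func(examples):
    texts = [
        alpaca_prompt.format(ins, inp, out) + EOS_TOKEN
        for ins, inp, out in zip(
            examples["instruction"], examples["input"], examples["output"]
        )
    ]
    return {"text": texts}

dataset = dataset.map(formatting_prompts_func, batched=True)
```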

### Training Configurations

<img width="614" alt="Training Configuration" src="https://github.com/user-attachments/assets/956acc04-ac6f-497b-9c12-9cc33b70301b">
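
A plausible configuration for a single Colab GPU looks like the following; every hyperparameter here is an assumption to tune, and the `SFTTrainer` keyword arguments assume the trl version used by the Unsloth Colab notebooks:

```python
import torch
from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",        # column produced by the formatting step
    max_seq_length=max_seq_length,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,   # effective batch size of 8
        warmup_steps=5,
        max_steps=60,                    # raise for a full training run
        learning_rate=2e-4,
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        logging_steps=1,
        optim="adamw_8bit",              # 8-bit optimizer saves VRAM
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)
```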

### Model Training

<img width="856" alt="Model  Training" src="https://github.com/user-attachments/assets/075ee343-8412-4ad4-bb4b-dd569663c4fd">
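
Training then comes down to one call, and `train()` returns stats worth logging:

```python
# Kick off fine-tuning; returns a TrainOutput with loss/runtime metrics.
trainer_stats = trainer.train()
print(trainer_stats.metrics)
```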

### Inference

<img width="713" alt="Inference  1" src="https://github.com/user-attachments/assets/189c2d17-9026-4cb3-bdfb-95435b075fae">

<img width="901" alt="Inference 2" src="https://github.com/user-attachments/assets/ea31462b-9e1c-4575-9120-5390cfbc23e2">
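
In sketch form, inference switches Unsloth into generation mode and decodes the output; the Kannada prompt is purely illustrative:

```python
# Switch Unsloth into fast inference mode before generating.
FastLanguageModel.for_inference(model)

inputs = tokenizer(
    [alpaca_prompt.format(
        "ಭಾರತದ ರಾಜಧಾನಿ ಯಾವುದು?",  # "What is the capital of India?" (example)
        "",   # no additional input
        "",   # response left blank for the model to fill in
    )],
    return_tensors="pt",
).to("cuda")

outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```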

### Saving the Model & Tokenizer

<img width="453" alt="Saving the model and tokenizer" src="https://github.com/user-attachments/assets/f6eb0858-f51e-452d-a65b-83945537e487">
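
Saving comes down to two calls; the directory name is a placeholder:

```python
# Persist only the LoRA adapter weights plus the tokenizer files.
model.save_pretrained("lora_model")       # placeholder path
tokenizer.save_pretrained("lora_model")
```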

### Merging the base model & fine-tuned LoRA layers

<img width="557" alt="Merge base model and finetuned layers" src="https://github.com/user-attachments/assets/15d66a2b-dfb9-471c-8fe0-9b13640d45e4">
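
Unsloth provides a helper that folds the adapters back into the base weights; the output path here is a placeholder, and `merged_16bit` is one of several save methods:

```python
# Fold the LoRA deltas back into the base weights and save a standalone model.
model.save_pretrained_merged(
    "merged_model",              # placeholder output directory
    tokenizer,
    save_method="merged_16bit",  # full fp16 weights; "lora" keeps adapters only
)
```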

### Pushing Model & Tokenizer to S3 Bucket

<img width="399" alt="Pushing to s3 1" src="https://github.com/user-attachments/assets/06948b95-59a6-4ad5-b530-90e075cc88f9">

<img width="527" alt="Pushing to s3 2" src="https://github.com/user-attachments/assets/2d944deb-b2f1-475a-834e-d462bb08fffb">

<img width="505" alt="Pushing to s3 3" src="https://github.com/user-attachments/assets/7fd11f13-57f2-43b0-b3e9-918e89b91b12">
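
With boto3 this is a walk over the merged directory; the bucket name and key prefix are placeholders, and AWS credentials are assumed to come from the environment:

```python
import os
import boto3

s3 = boto3.client("s3")
BUCKET = "my-model-artifacts"          # placeholder bucket name
PREFIX = "kannada-llama-3.2-3b"        # placeholder key prefix

# Upload every file under merged_model/ to s3://BUCKET/PREFIX/...
for root, _, files in os.walk("merged_model"):
    for name in files:
        local_path = os.path.join(root, name)
        key = f"{PREFIX}/{os.path.relpath(local_path, 'merged_model')}"
        s3.upload_file(local_path, BUCKET, key)
        print(f"uploaded s3://{BUCKET}/{key}")
```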

---

## Ok, so now let's talk about the Deployment/Inference Pipeline 🚀

*This is a diagram of how the pipeline will look:*