This project demonstrates how to fine-tune the lightweight Qwen3-0.6B language model using the Unsloth library. It combines Chain-of-Thought (CoT) reasoning and chat-style datasets to improve multi-turn reasoning and conversational ability, using efficient LoRA adapters and 4-bit quantization.
- ✅ Based on Qwen3-0.6B – a fast, efficient LLM from Alibaba
- ✅ Uses Unsloth for blazing-fast fine-tuning and LoRA support
- ✅ Incorporates reasoning and non-reasoning data for balanced learning
- ✅ Supports thinking-mode inference using `<think>` tags
- ✅ Designed for low-resource environments with 4-bit quantization
- ✅ Modular, extensible, and easy to adapt to your own datasets
Install required libraries:
```bash
pip install --no-deps bitsandbytes accelerate xformers==0.0.29.post3 peft trl==0.15.2 triton cut_cross_entropy unsloth_zoo
pip install sentencepiece protobuf datasets huggingface_hub hf_transfer
pip install --no-deps unsloth
```
| Dataset | Type | Description |
|---|---|---|
| `unsloth/OpenMathReasoning-mini` | Reasoning | Math problems with Chain-of-Thought answers |
| `mlabonne/FineTome-100k` | Chat/Non-CoT | Standard ShareGPT-style conversations |
You can easily swap in your own datasets using the standardized format.
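For instance, a simple question/answer dataset can be mapped into the ShareGPT-style `conversations` format before training. This is a minimal sketch; the input field names (`question`, `answer`) are hypothetical and should be adapted to your own dataset's schema.

```python
# Sketch: convert a question/answer row into a ShareGPT-style
# conversation. Field names "question" and "answer" are placeholders
# for whatever your dataset actually uses.
def to_sharegpt(example):
    return {
        "conversations": [
            {"from": "human", "value": example["question"]},
            {"from": "gpt", "value": example["answer"]},
        ]
    }

row = {"question": "What is 2 + 2?", "answer": "4"}
print(to_sharegpt(row))
```

A function like this can be applied over a whole dataset with `datasets.Dataset.map`.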
- Load and quantize the model
- Add LoRA adapters
- Prepare reasoning and non-reasoning data
- Balance the dataset with a configurable `chat_percentage`
- Tokenize and format inputs using Unsloth’s chat template
- Train using TRL’s `SFTTrainer`
- Test inference with and without `<think>` tags
- Save LoRA adapters locally or push to the Hugging Face Hub
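The balancing step can be sketched roughly as follows: keep all reasoning rows and sample enough chat rows so that chat makes up about `chat_percentage` of the combined dataset. This is an illustrative sketch only; the actual script may sample differently.

```python
import random

# Sketch of chat_percentage balancing: keep every reasoning row and
# sample chat rows so they form roughly chat_percentage of the mix.
def mix_datasets(reasoning_rows, chat_rows, chat_percentage=0.25, seed=3407):
    random.seed(seed)
    # Number of chat rows needed so chat is chat_percentage of the total.
    n_chat = round(len(reasoning_rows) * chat_percentage / (1.0 - chat_percentage))
    n_chat = min(n_chat, len(chat_rows))
    mixed = reasoning_rows + random.sample(chat_rows, n_chat)
    random.shuffle(mixed)
    return mixed

reasoning = [{"text": f"cot-{i}"} for i in range(75)]
chat = [{"text": f"chat-{i}"} for i in range(100)]
mixed = mix_datasets(reasoning, chat, chat_percentage=0.25)
print(len(mixed))  # 75 reasoning + 25 chat = 100 rows
```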
```python
messages = [{"role": "user", "content": "Solve (x + 2)^2 = 0."}]
text_input = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # Set to False for a non-CoT response
)
```
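After generation, the reasoning trace can be separated from the final answer. A minimal sketch, assuming the decoded output wraps its reasoning in a `<think>...</think>` block (as thinking-mode Qwen3 responses do):

```python
import re

# Sketch: split a decoded response into its reasoning trace and the
# final answer, assuming the reasoning is wrapped in <think> tags.
def split_thinking(response: str):
    match = re.search(r"<think>(.*?)</think>", response, re.DOTALL)
    if match is None:
        return "", response.strip()  # No thinking block found
    thinking = match.group(1).strip()
    answer = response[match.end():].strip()
    return thinking, answer

output = "<think>\n(x + 2)^2 = 0 => x + 2 = 0 => x = -2\n</think>\nx = -2"
thinking, answer = split_thinking(output)
print(answer)  # x = -2
```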
After training:
```python
model.save_pretrained("qwen3_0.6b_reasoning_chat_lora")
tokenizer.save_pretrained("qwen3_0.6b_reasoning_chat_lora")
```
Optionally, push to the 🤗 Hub:
```python
model.push_to_hub("your-username/qwen3_0.6b_reasoning_chat_lora", token="your_token")
tokenizer.push_to_hub("your-username/qwen3_0.6b_reasoning_chat_lora", token="your_token")
```
You can tweak training and adapter parameters in the script:
```python
# LoRA config
r = 32
lora_alpha = 32
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]

# SFT config
batch_size = 2
gradient_accumulation = 4
learning_rate = 2e-4
max_steps = 30
```
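Note that with these defaults the optimizer sees an effective batch size of `batch_size * gradient_accumulation` sequences per update step:

```python
# Effective batch size per optimizer step: the per-device batch size
# multiplied by the number of gradient accumulation steps.
batch_size = 2
gradient_accumulation = 4
effective_batch_size = batch_size * gradient_accumulation
print(effective_batch_size)  # 8
```

Raising `gradient_accumulation` is a memory-friendly way to increase the effective batch size on small GPUs.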
With thinking enabled (`<think>`):

```
<think>
Let’s solve (x + 2)^2 = 0.
Take the square root of both sides:
x + 2 = 0
=> x = -2
</think>
x = -2
```

Without thinking:

```
x = -2
```
```
unsloth
transformers
trl
datasets
bitsandbytes
xformers
peft
torch
sentencepiece
```
- Unsloth for efficient fine-tuning
- Alibaba Qwen team for the model
- mlabonne and others for open datasets
This project follows the Apache 2.0 License, as per the base model and Unsloth.