LLAMA3.2-Nepali-318M (from scratch) #570
Aananda-giri started this conversation in Show and tell
Replies: 1 comment
-
That's great! Thanks for sharing this! I think smaller LLMs like these are super nice because they make it easier to experiment with them. There are also several readers asking me about non-English LLMs, and this is another nice resource to refer them to.
-
Hi everyone! 👋
I’m thrilled to share my recent project: LLAMA3.2-Nepali-318M, a LLAMA 3.2 model pretrained from scratch for the Nepali language. The project builds on the LLAMA 3.2 training code detailed in Build a Large Language Model (From Scratch), adapting it specifically for Nepali.
🔗 Project Links
💬 Chat Interface: Hugging Face Space
📦 Pre-Trained Model: Hugging Face Model
💻 Training Code: GitHub Repository
📊 Dataset: IRIISNEPAL/Nepali-Text-Corpus | NepBERTa
🔍 Major Changes from Original Code
1️⃣ New Model Configurations (illustrative config sketch below)
2️⃣ Tokenizer (tokenizer training sketch below)
3️⃣ Dataset (dataset loading sketch below)
4️⃣ Dataloader (sliding-window dataloader sketch below)
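The post doesn't list the 318M model's hyperparameters here (they live in the GitHub repository linked above). Purely as an illustration, a Llama-3.2-style config in the book's dict format might look like the sketch below; every value is a placeholder assumption, not the published configuration.

```python
import torch

# Illustrative placeholder config in the style of the book's Llama 3.2 code.
# None of these values are the actual LLAMA3.2-Nepali-318M settings; see the
# GitHub repository / Hugging Face model card for the real ones.
LLAMA32_NEPALI_CONFIG = {
    "vocab_size": 50_000,       # assumed Nepali tokenizer vocabulary size
    "context_length": 1024,     # assumed shorter context for affordable pretraining
    "emb_dim": 1024,            # hidden size (placeholder)
    "n_heads": 16,              # attention heads (placeholder)
    "n_layers": 16,             # transformer blocks (placeholder)
    "hidden_dim": 4096,         # feed-forward inner dimension (placeholder)
    "n_kv_groups": 4,           # grouped-query attention groups, as in Llama 3.x
    "rope_base": 500_000.0,     # RoPE theta used by Llama 3.x
    "dtype": torch.bfloat16,    # lower-precision weights to save memory
}
```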
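The tokenizer details aren't spelled out in this summary either. One common route, shown here only as an assumed sketch, is to train a byte-level BPE tokenizer on the Nepali corpus with the Hugging Face `tokenizers` library; the vocabulary size, special tokens, and file paths below are placeholders.

```python
# Assumed sketch: train a byte-level BPE tokenizer on Nepali text with the
# Hugging Face `tokenizers` library. Vocab size, special tokens, and paths
# are placeholders, not the project's actual choices.
from tokenizers import Tokenizer, models, pre_tokenizers, decoders, trainers

tokenizer = Tokenizer(models.BPE())
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)
tokenizer.decoder = decoders.ByteLevel()

trainer = trainers.BpeTrainer(
    vocab_size=50_000,                 # placeholder vocabulary size
    special_tokens=["<|endoftext|>"],  # assumed end-of-text token
)

# "nepali_corpus.txt" is a placeholder path to the raw Nepali training text.
tokenizer.train(files=["nepali_corpus.txt"], trainer=trainer)
tokenizer.save("nepali_bpe_tokenizer.json")
```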
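The dataset is the corpus linked above (IRIISNEPAL/Nepali-Text-Corpus). A minimal way to pull and inspect it with the `datasets` library is sketched below; the split name and streaming mode are assumptions, so check the dataset card for the actual schema.

```python
# Minimal sketch for loading the corpus linked above with Hugging Face
# `datasets`. The "train" split and streaming mode are assumptions; the
# field names depend on the dataset card.
from itertools import islice
from datasets import load_dataset

dataset = load_dataset("IRIISNEPAL/Nepali-Text-Corpus", split="train", streaming=True)

# Peek at a few records to see which field holds the raw Nepali text.
for example in islice(dataset, 3):
    print(example)
```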
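The dataloader changes aren't reproduced here, so the sketch below is only a minimal sliding-window, next-token-prediction dataset in the spirit of the book's dataloader chapter. The window length, stride, and the assumption that `token_ids` is one long list of tokenizer output are all placeholders.

```python
# Minimal sliding-window dataset for next-token prediction, in the spirit of
# the dataloader from the book. Window length and stride are placeholders, and
# `token_ids` is assumed to be one long list of token IDs from the tokenizer.
import torch
from torch.utils.data import Dataset, DataLoader

class NepaliTokenDataset(Dataset):
    def __init__(self, token_ids, max_length=1024, stride=1024):
        self.inputs, self.targets = [], []
        # Slide a fixed-size window over the token stream; each target is the
        # input shifted right by one position.
        for i in range(0, len(token_ids) - max_length, stride):
            chunk = token_ids[i : i + max_length + 1]
            self.inputs.append(torch.tensor(chunk[:-1]))
            self.targets.append(torch.tensor(chunk[1:]))

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, idx):
        return self.inputs[idx], self.targets[idx]

# Example with dummy token IDs:
loader = DataLoader(
    NepaliTokenDataset(list(range(5_000)), max_length=256, stride=256),
    batch_size=8, shuffle=True, drop_last=True,
)
```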
✨ Final Thoughts
A huge thank you to @rasbt for the inspiration and for writing an incredible book, the best book on LLMs I’ve read!
🤗 Happy coding! 🎉