I've always been fascinated by how poets and writers express their thoughts so beautifully through poems and stories. Honestly, I was never very good at it myself, but with the rise of GenAI that no longer matters. So I decided to create something that could write better than me: Gazal-e-GPT.
Instead of simply fine-tuning a pre-trained model (which would have been easier), I treated this as a learning opportunity and built the entire GPT model from scratch, coding every layer and connection and training it step by step.
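For a sense of what "from scratch" means here, below is a minimal PyTorch sketch of the kind of decoder block a GPT stacks: causal self-attention followed by an MLP, each wrapped in a pre-norm residual. The hyperparameters (`n_embd`, `n_head`, `block_size`) are illustrative assumptions, not the repo's actual values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, n_embd: int, n_head: int, block_size: int):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.qkv = nn.Linear(n_embd, 3 * n_embd)  # fused query/key/value projection
        self.proj = nn.Linear(n_embd, n_embd)     # output projection
        # lower-triangular mask: each token attends only to itself and earlier tokens
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) / (k.size(-1) ** 0.5)  # scaled dot-product
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        y = F.softmax(att, dim=-1) @ v
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(y)

class Block(nn.Module):
    """One decoder block: attention + MLP, each with a pre-norm residual."""
    def __init__(self, n_embd: int, n_head: int, block_size: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention(n_embd, n_head, block_size)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.GELU(), nn.Linear(4 * n_embd, n_embd)
        )

    def forward(self, x):
        x = x + self.attn(self.ln1(x))
        x = x + self.mlp(self.ln2(x))
        return x
```

The full model is then just token and position embeddings, a stack of these blocks, a final LayerNorm, and a linear head back to vocabulary logits.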
Since I was limited by my MacBook Pro and its MPS device support, training was slow. With a few tweaks, Google Colab offered much better performance.
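Switching between the two environments is straightforward in PyTorch: pick the best available backend at startup and move the model and batches onto it. A minimal sketch, not necessarily the exact code in the repo:

```python
import torch

# Prefer CUDA (e.g. a Colab GPU), then Apple's MPS backend, then fall back to CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

print(f"training on: {device}")
# model.to(device) and batch.to(device) then route all compute to this backend
```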
For Hindi-Urdu ghazal-style text, there isn't much data available. I used the dataset from this repo, focusing only on the Hindi versions for now.
The dataset is around 2 MB, which is small relative to the model size. Increasing the dataset size would almost certainly improve the model's output quality.
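A rough sketch of the data preparation follows; the filename is hypothetical, and a character-level encoding stands in for the real tokenizer just to keep the example self-contained:

```python
import torch

# Read the whole ghazal corpus as one UTF-8 string (filename is hypothetical).
text = open("ghazals_hindi.txt", encoding="utf-8").read()
print(f"corpus size: {len(text.encode('utf-8')) / 1e6:.1f} MB")

# Character-level stand-in for the real tokenizer, for illustration only.
stoi = {ch: i for i, ch in enumerate(sorted(set(text)))}
ids = torch.tensor([stoi[ch] for ch in text], dtype=torch.long)

# 90/10 train/validation split over the token stream.
n = int(0.9 * len(ids))
train_ids, val_ids = ids[:n], ids[n:]

def get_batch(data, block_size=128, batch_size=32):
    # Sample random windows; the target is the input shifted one token right.
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i : i + block_size] for i in ix])
    y = torch.stack([data[i + 1 : i + 1 + block_size] for i in ix])
    return x, y
```

At roughly 2 MB of text, even a modest GPT can start memorizing the corpus, which is why more data matters so much here.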
User Input: चलो आज फिर चलते हैं (roughly, "come, let's set out again today")
Epoch-1 Model Output: खुदा से ये गुजारिशમે साला<<reserved_token_3325>> Agency<<reserved_token_4075>> ਲੰਬਾ ਜਾਣਕਾਰੀ ਨਿਰਧਾਰਤ ਮੌਜੂਦsd റൂ females females AT ವ್ಯವಹ Illహంpret<<reserved_token_2098>> dietsm गोष्टी used વેપ ਗਵਰਨਰસંગતitory ವ್ಯವಹನಿಯನ್ गोंൃതിसंबंध British ਪਾਲ খেলেনটো ನಿರ್ವಹಿಸಲುसतनਗਤ<<reserved_token_2313>>ಬರ್ pictures
Epoch-800 Model Output: चलो आज फिर चलते हैं सुनते थे दिल की आग उस की तबी दयारों पे हम भी अब दादे हैं तो जी बहकी रुफ़ाई है पयाम रख देंगी गुज़राज़ारे लोग मिले भी अब तक लेकिन एक दिलबर नहीं है उल्फ
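Outputs like these come from a standard autoregressive sampling loop: encode the prompt, repeatedly sample the next token from the softmax over the logits, and decode. The sketch below is generic; `model`, `encode`, and `decode` are stand-ins for the project's own GPT and tokenizer. It also explains the epoch-1 gibberish: with near-random weights the distribution over the multilingual vocabulary is almost uniform, so sampling mixes scripts and reserved tokens.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, prompt_ids, max_new_tokens=80, temperature=0.8, block_size=128):
    ids = prompt_ids                                   # shape (1, T)
    for _ in range(max_new_tokens):
        logits = model(ids[:, -block_size:])           # crop to the context window
        logits = logits[:, -1, :] / temperature        # last position, softened
        probs = F.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)  # sample, don't argmax
        ids = torch.cat([ids, next_id], dim=1)
    return ids

# Hypothetical usage with the project's tokenizer helpers:
# ids = torch.tensor([encode("चलो आज फिर चलते हैं")])
# print(decode(generate(model, ids)[0].tolist()))
```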
- The current tokenizer is from Sarvam AI. While it's great for Indic languages, it isn't specifically designed for Urdu, which impacts performance (a loading sketch follows this list).
- Increasing the dataset size and moving training to CUDA for speed.
- Pretraining on general Hindi-Urdu text and poetic data, then fine-tuning on ghazal-style text to better align outputs with the desired ghazal style.
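For reference, loading an Indic tokenizer from the Hugging Face Hub is a two-liner; the model id below is an assumption, so substitute whichever Sarvam AI checkpoint the project actually pins.

```python
from transformers import AutoTokenizer

# Hypothetical model id; replace with the checkpoint the project actually uses.
tok = AutoTokenizer.from_pretrained("sarvamai/sarvam-1")

ids = tok.encode("चलो आज फिर चलते हैं")
print(ids)              # token ids fed to the model
print(tok.decode(ids))  # round-trips back to the original text
```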
- The GPT-2 and GPT-3 research papers
- OpenAI's GPT-2 repo
- Andrej Karpathy's YouTube lectures