
Gazal-e-GPT

Background

I've always been fascinated by how poets and writers beautifully express their thoughts through poems and stories. Honestly, I was never very good at it myself, but with the rise of GenAI that no longer matters. So I decided to create something that could write better than me: Gazal-e-GPT.

Instead of just fine-tuning a pre-trained model (which would have been easier), I took it as a learning opportunity: I built the entire GPT model from scratch, coding every layer and connection, and then trained it step by step.
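To make "from scratch" concrete, here is a minimal sketch of the kind of pre-norm GPT-style transformer block such a model is built out of. The class names and sizes are illustrative assumptions, not the repository's actual code:

```python
# A minimal sketch of one GPT-style transformer block (pre-norm, causal
# attention). Names and sizes are illustrative, not the repo's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, n_embd: int, n_head: int):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.qkv = nn.Linear(n_embd, 3 * n_embd)  # joint Q, K, V projection
        self.proj = nn.Linear(n_embd, n_embd)     # output projection

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # reshape to (B, n_head, T, head_dim) for multi-head attention
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        # is_causal=True masks attention to future positions
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(y)

class Block(nn.Module):
    def __init__(self, n_embd: int, n_head: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention(n_embd, n_head)
        self.ln2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.GELU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        x = x + self.attn(self.ln1(x))  # residual attention sub-layer
        x = x + self.mlp(self.ln2(x))   # residual MLP sub-layer
        return x

# e.g. Block(n_embd=256, n_head=4)(torch.randn(2, 16, 256))
```

A full GPT stacks several such blocks between a token/position embedding and a final language-model head.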

Training Process

Since I was limited to my MacBook Pro and its MPS device support, training was slow. With a few tweaks, though, Google Colab offered much better performance.
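The README doesn't show the training script, but a device fallback for this MacBook-or-Colab setup typically looks roughly like this (a sketch, not the repo's code):

```python
# Prefer CUDA (e.g. a Colab GPU runtime), then Apple MPS, then CPU.
import torch

def pick_device() -> torch.device:
    if torch.cuda.is_available():            # Google Colab GPU
        return torch.device("cuda")
    if torch.backends.mps.is_available():    # Apple-silicon MacBook Pro
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
print(device)
```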

Dataset

There isn't much data available for Hindi-Urdu ghazal-style text. I used the dataset from this repo, focusing only on the Hindi versions for now.
The dataset is around 2 MB, which is small compared to the model size; increasing the dataset would definitely improve the model's performance.
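For illustration, preparing a ~2 MB text corpus for next-token training might look like the sketch below. The file path is hypothetical, and the tokenizer checkpoint is an assumption, since the README names Sarvamai but not a specific checkpoint:

```python
# A rough sketch of turning the ghazal corpus into training batches.
import torch
from transformers import AutoTokenizer

TOKENIZER_ID = "sarvamai/sarvam-1"   # assumption: exact checkpoint not named
CORPUS_PATH = "ghazals_hindi.txt"    # hypothetical path

tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
text = open(CORPUS_PATH, encoding="utf-8").read()

ids = torch.tensor(tokenizer.encode(text), dtype=torch.long)
n = int(0.9 * len(ids))
train_ids, val_ids = ids[:n], ids[n:]  # simple 90/10 split

def get_batch(data, block_size=256, batch_size=16):
    # Sample random contiguous windows; targets are inputs shifted by one.
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i : i + block_size] for i in ix])
    y = torch.stack([data[i + 1 : i + 1 + block_size] for i in ix])
    return x, y
```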

Training Graphs

(Training graphs: three plot images embedded in the repository.)

Outputs

User Input: चलो आज फिर चलते हैं
Epoch-1 Model Output: खुदा से ये गुजारिशમે साला<<reserved_token_3325>> Agency<<reserved_token_4075>> ਲੰਬਾ ਜਾਣਕਾਰੀ ਨਿਰਧਾਰਤ ਮੌਜੂਦsd റൂ females females AT ವ್ಯವಹ Illహంpret<<reserved_token_2098>> dietsm गोष्टी used વેપ ਗਵਰਨਰસંગતitory ವ್ಯವಹನಿಯನ್ गोंൃതിसंबंध British ਪਾਲ খেলেনটো ನಿರ್ವಹಿಸಲುसतनਗਤ<<reserved_token_2313>>ಬರ್ pictures

Epoch-800 Model Output: चलो आज फिर चलते हैं सुनते थे दिल की आग उस की तबी दयारों पे हम भी अब दादे हैं तो जी बहकी रुफ़ाई है पयाम रख देंगी गुज़राज़ारे लोग मिले भी अब तक लेकिन एक दिलबर नहीं है उल्फ
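Outputs like these come from an autoregressive sampling loop. A hedged sketch of such a loop is below; `model` (returning logits of shape batch × time × vocab) and `tokenizer` are assumed objects, not the repo's actual ones:

```python
# Sample tokens one at a time from the model, with temperature and top-k
# filtering; `model` and `tokenizer` are assumed to exist.
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, tokenizer, prompt, max_new_tokens=80,
             temperature=0.9, top_k=50):
    ids = torch.tensor([tokenizer.encode(prompt)], dtype=torch.long)
    for _ in range(max_new_tokens):
        logits = model(ids)[:, -1, :] / temperature  # last-position logits
        v, _ = torch.topk(logits, top_k)
        logits[logits < v[:, [-1]]] = -float("inf")  # keep only top-k tokens
        probs = F.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=1)       # append and continue
    return tokenizer.decode(ids[0].tolist())

# e.g. generate(model, tokenizer, "चलो आज फिर चलते हैं")
```

At epoch 1 the weights are near-random, so the loop emits arbitrary tokens across scripts; by epoch 800 it continues the prompt in coherent ghazal-like Hindi.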

Future Improvement Plans

  • The current tokenizer is from Sarvamai. While it's great for Indic languages, it's not specifically designed for Urdu, which impacts performance.
  • Increasing dataset size and using CUDA for faster training.
  • Pretraining on general Hindi-Urdu and poetic text, followed by fine-tuning on ghazal-style text, to better align outputs with the desired ghazal style (a rough sketch of this two-stage setup follows this list).
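A minimal sketch of that planned two-stage schedule, assuming a `model` that returns logits and zero-argument batch callables; none of this is the repo's code:

```python
# Run the same training loop twice: first on a general corpus, then on the
# ghazal corpus with a smaller learning rate for fine-tuning.
import torch
import torch.nn.functional as F

def train_stage(model, get_batch, steps, lr):
    # get_batch: zero-argument callable returning an (inputs, targets) pair
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(steps):
        x, y = get_batch()
        logits = model(x)  # (B, T, vocab) logits assumed
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

# stage 1: general Hindi-Urdu text; stage 2: ghazals at a lower lr
# train_stage(model, general_batches, steps=10_000, lr=3e-4)
# train_stage(model, ghazal_batches,  steps=2_000,  lr=3e-5)
```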

