Transformer Implementation from Scratch 🚀

A PyTorch implementation of the Transformer architecture as described in "Attention Is All You Need". This project includes a complete, modular implementation of the Transformer applied to machine translation from English to Italian.

🌟 Features

  • Complete transformer architecture implementation
  • Modular design with separate encoder and decoder components
  • Multi-head attention mechanism
  • Support for custom tokenization
  • Training and inference scripts included
  • Translation example implementation

🛠️ Components

  • model.py: Core transformer architecture
  • train.py: Training loop and utilities
  • translate.py: Inference and translation script
  • dataset.py: Data loading and preprocessing
  • config.py: Configuration and hyperparameters
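
A rough sketch of how these pieces fit together is shown below. The helper names `get_config` and `build_transformer` are assumptions for illustration and may not match the actual interfaces in `config.py` and `model.py`:

```python
# Rough wiring of the components (the helper names below are assumptions;
# check config.py and model.py for the actual interfaces).
import torch
from config import get_config        # assumed: returns a dict of hyperparameters
from model import build_transformer  # assumed: factory for the full Transformer

config = get_config()
model = build_transformer(
    src_vocab_size=32000,            # placeholder vocabulary sizes
    tgt_vocab_size=32000,
    src_seq_len=config["seq_len"],
    tgt_seq_len=config["seq_len"],
    d_model=config["d_model"],
)
model.to("cuda" if torch.cuda.is_available() else "cpu")
```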

🚀 Quick Start

```bash
# Clone the repository
git clone https://github.com/nevernever69/Transformer-in-pytorch.git
cd Transformer-in-pytorch

# Install requirements
pip install -r requirements.txt

# Train the model
python train.py

# Translate a sentence
python translate.py
```

or

Load pre-trained weights:

```bash
# Create the necessary directory
mkdir -p opus_books_weights
```

## Download pre-trained weights and tokenizer files

- Instructions will be added here once the weights upload finishes.
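
Once the weights are published, loading a checkpoint into the model should look roughly like the following. The filename and checkpoint keys are assumptions and will need to be adjusted to the released files:

```python
# Load a pre-trained checkpoint into the model (the filename and checkpoint
# keys are assumptions; adjust once the weights are published).
import torch

checkpoint = torch.load("opus_books_weights/tmodel_02.pt", map_location="cpu")
# Checkpoints written by train.py are assumed to bundle the model state dict
# together with optimizer state and the epoch number.
model.load_state_dict(checkpoint["model_state_dict"])
model.eval()
```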

📋 Model Architecture

```
Transformer
├── Encoder (6 layers)
│   ├── Multi-Head Attention
│   ├── Feed Forward Network
│   └── Layer Normalization
└── Decoder (6 layers)
    ├── Masked Multi-Head Attention
    ├── Multi-Head Attention
    ├── Feed Forward Network
    └── Layer Normalization
```
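
Both stacks are built around multi-head attention: the input is split into `h` heads, each head computes scaled dot-product attention, and the results are recombined. A simplified sketch of that block (the version in `model.py` also adds dropout and wraps each sub-layer in residual connections with layer normalization):

```python
import math
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Simplified multi-head attention: project q/k/v, split into h heads,
    apply scaled dot-product attention per head, then recombine."""
    def __init__(self, d_model: int, h: int):
        super().__init__()
        assert d_model % h == 0
        self.h, self.d_k = h, d_model // h
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, q, k, v, mask=None):
        B = q.size(0)
        # Project and reshape to (batch, heads, seq_len, d_k).
        q = self.w_q(q).view(B, -1, self.h, self.d_k).transpose(1, 2)
        k = self.w_k(k).view(B, -1, self.h, self.d_k).transpose(1, 2)
        v = self.w_v(v).view(B, -1, self.h, self.d_k).transpose(1, 2)
        # Scaled dot-product attention, with optional masking
        # (padding mask in the encoder, causal mask in the decoder).
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_k)
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float("-inf"))
        attn = scores.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).contiguous().view(B, -1, self.h * self.d_k)
        return self.w_o(out)
```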

Results

Training

```
Processing Epoch 00: 100% 3638/3638 [23:45<00:00,  2.55it/s, loss=6.048]
Processing Epoch 01: 100% 3638/3638 [23:47<00:00,  2.55it/s, loss=5.207]
Processing Epoch 02: 100% 3638/3638 [23:47<00:00,  2.55it/s, loss=4.183]
```

Machine translation

```
Using device: cpu
    SOURCE: I am not a very good a student.
 PREDICTED: Io non ho il  il  .
```
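
The prediction above is produced one token at a time. A rough sketch of greedy decoding, assuming the model exposes separate `encode`, `decode`, and `project` methods (these names are assumptions; `translate.py` contains the actual inference loop):

```python
# Greedy decoding sketch (the encode/decode/project method names are
# assumptions; see translate.py for the real inference loop).
import torch

def greedy_decode(model, src, src_mask, sos_id, eos_id, max_len, device):
    encoder_out = model.encode(src, src_mask)
    ys = torch.tensor([[sos_id]], dtype=torch.long, device=device)
    for _ in range(max_len - 1):
        decoder_out = model.decode(encoder_out, src_mask, ys, None)
        logits = model.project(decoder_out[:, -1])     # next-token logits
        next_id = logits.argmax(dim=-1, keepdim=True)  # most likely token
        ys = torch.cat([ys, next_id], dim=1)
        if next_id.item() == eos_id:                   # stop at end-of-sentence
            break
    return ys.squeeze(0)
```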

📚 Training Data

The model can be trained on any parallel corpus. The example implementation uses the Opus Books dataset from Hugging Face.
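
For reference, the English-Italian pairs of Opus Books can be pulled straight from the Hugging Face Hub and used to train a word-level tokenizer. A minimal sketch follows; the actual loading and tokenizer setup live in `dataset.py` and may differ:

```python
# Load the English-Italian split of Opus Books and train a word-level
# tokenizer over it (a sketch; dataset.py may organise this differently).
from datasets import load_dataset
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.trainers import WordLevelTrainer
from tokenizers.pre_tokenizers import Whitespace

raw = load_dataset("opus_books", "en-it", split="train")

def sentences(lang):
    # Each item holds a {"en": ..., "it": ...} translation pair.
    for item in raw:
        yield item["translation"][lang]

tokenizer = Tokenizer(WordLevel(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()
trainer = WordLevelTrainer(special_tokens=["[UNK]", "[PAD]", "[SOS]", "[EOS]"],
                           min_frequency=2)
tokenizer.train_from_iterator(sentences("en"), trainer=trainer)
```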

🤝 Contributing

Contributions are welcome! Feel free to submit pull requests or open issues for bugs and feature requests.

📝 License

MIT License - feel free to use this code for your own projects!

⭐️ Show Your Support

If you find this implementation helpful, give it a star! ⭐️

Special Thanks

  • Umar Jamil for his Transformer from Scratch video
  • Campusx and CodeEmporium for helping me understand the Transformer
