This repository contains implementations of Sequence-to-Sequence (Seq2Seq) neural models for English → Hindi Neural Machine Translation (NMT), including a baseline architecture and an improved attention-based model.
- Dataset: 100k parallel English–Hindi sentences (70k for training, 30k for testing)
- Epochs: 15
- Batch Size: 64
- Framework: PyTorch
- Loss Function: Negative Log Likelihood Loss (NLLLoss)
- Optimizer: Adam (a minimal sketch of this loss/optimizer setup follows this list)
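The notebooks pair NLLLoss with the Adam optimizer as listed above. The snippet below is only an illustrative sketch of that wiring on dummy tensors; the vocabulary size, hidden size, and layer names are assumptions for the example, not values taken from the notebooks.

```python
import torch
import torch.nn as nn

# Illustrative sizes only; the notebooks define their own vocabulary and dimensions.
VOCAB_SIZE, HIDDEN_SIZE, BATCH_SIZE = 5000, 256, 64

# Stand-in output layer: NLLLoss expects log-probabilities,
# so the model's final projection is followed by log_softmax.
output_layer = nn.Linear(HIDDEN_SIZE, VOCAB_SIZE)
criterion = nn.NLLLoss()
optimizer = torch.optim.Adam(output_layer.parameters())

# One dummy training step on random data, just to show the loss/optimizer flow.
hidden_states = torch.randn(BATCH_SIZE, HIDDEN_SIZE)
targets = torch.randint(0, VOCAB_SIZE, (BATCH_SIZE,))

log_probs = torch.log_softmax(output_layer(hidden_states), dim=-1)
loss = criterion(log_probs, targets)

optimizer.zero_grad()
loss.backward()
optimizer.step()
```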
- Seq2Seq: based on Sequence to Sequence Learning with Neural Networks (Sutskever et al.), comprising Encoder (LSTM) → Context Vector → Decoder (LSTM) → Word Predictor. A sketch of this pipeline is shown below.
- Seq2Seq with Attention: based on Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau et al.), which introduces an attention mechanism for better alignment and handling of long sentences. A sketch of the attention step follows the baseline example.
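A minimal sketch of the baseline encoder → context vector → decoder → word predictor pipeline. Class names, embedding/hidden sizes, and the assumed `<sos>` index are illustrative only; see seq2seq.ipynb for the actual implementation.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Embeds the source sentence and compresses it into a context vector."""
    def __init__(self, src_vocab, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(src_vocab, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)

    def forward(self, src):                      # src: (batch, src_len)
        embedded = self.embedding(src)
        _, (hidden, cell) = self.lstm(embedded)  # final states act as the context vector
        return hidden, cell

class Decoder(nn.Module):
    """Generates the target sentence one token at a time from the context vector."""
    def __init__(self, tgt_vocab, emb_dim=256, hid_dim=512):
        super().__init__()
        self.embedding = nn.Embedding(tgt_vocab, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.word_predictor = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, tgt_token, hidden, cell):  # tgt_token: (batch, 1)
        embedded = self.embedding(tgt_token)
        output, (hidden, cell) = self.lstm(embedded, (hidden, cell))
        log_probs = torch.log_softmax(self.word_predictor(output), dim=-1)
        return log_probs, hidden, cell

# Example: encode a dummy English batch, then predict the first Hindi token.
encoder, decoder = Encoder(src_vocab=4000), Decoder(tgt_vocab=6000)
src = torch.randint(0, 4000, (2, 10))            # dummy source batch
hidden, cell = encoder(src)
sos = torch.zeros(2, 1, dtype=torch.long)        # assumed <sos> index 0
log_probs, hidden, cell = decoder(sos, hidden, cell)
```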
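The attention model replaces the single fixed context vector with a weighted sum over all encoder states, recomputed at every decoder step. Below is a minimal Bahdanau-style additive attention sketch on dummy tensors; the module name, layer names, and dimensions are assumptions, and seq2seq_attention.ipynb remains the reference implementation.

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    """Bahdanau-style attention: scores each encoder state against the
    current decoder state and returns a weighted context vector."""
    def __init__(self, hid_dim=512):
        super().__init__()
        self.W_dec = nn.Linear(hid_dim, hid_dim)
        self.W_enc = nn.Linear(hid_dim, hid_dim)
        self.v = nn.Linear(hid_dim, 1)

    def forward(self, dec_hidden, enc_outputs):
        # dec_hidden: (batch, hid_dim); enc_outputs: (batch, src_len, hid_dim)
        scores = self.v(torch.tanh(
            self.W_dec(dec_hidden).unsqueeze(1) + self.W_enc(enc_outputs)
        ))                                              # (batch, src_len, 1)
        weights = torch.softmax(scores, dim=1)          # alignment over source tokens
        context = (weights * enc_outputs).sum(dim=1)    # (batch, hid_dim)
        return context, weights.squeeze(-1)

# Example: one decoder step attending over 10 dummy source positions.
attention = AdditiveAttention()
enc_outputs = torch.randn(2, 10, 512)
dec_hidden = torch.randn(2, 512)
context, weights = attention(dec_hidden, enc_outputs)
print(weights.sum(dim=1))   # each row of attention weights sums to 1
```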
- Baseline Seq2Seq Model: produces syntactically correct outputs, but they are often repetitive or contextually weak.
- Attention-based Model: Better at focusing on important input words, resulting in translations more relevant to the target sentence, especially for longer sequences.
- Observation: Attention improves translation quality and contextual accuracy compared to plain Seq2Seq.
- seq2seq.ipynb → Basic Seq2Seq model implementation
- seq2seq_attention.ipynb → Seq2Seq model with attention mechanism
- Paper2.csv → Sample dataset used for experiments
- report → Report explaining the project, with a few tested examples and observations
- Open and run the notebooks: seq2seq.ipynb and seq2seq_attention.ipynb
- Adjust the dataset path and training hyperparameters as needed.
- Python 3.x
- PyTorch
- NumPy, Matplotlib
- Jupyter Notebook
This project is licensed under the MIT License.