This project explores Natural Language Inference (NLI) in a multilingual setting by evaluating how NLI classifiers perform on English premise-hypothesis pairs that have been machine-translated into French. Transformer-based sequence-to-sequence models translate the English pairs into French, and NLI classifiers then predict the logical relationship (entailment, contradiction, or neutral) between the two statements.
Translational_NLI/
├── Classification_Final/ # NLI classification models and pipeline
├── Translation_Final/ # Machine translation implementation
├── Evaluation_Final/ # Model evaluation and analysis
├── Output/ # Results and predictions
└── README.md # Project documentation
The classification system implements multiple approaches for Natural Language Inference:
- Traditional ML Models: Multinomial Naive Bayes classifier
- Deep Learning Models:
  - CNN (Convolutional Neural Network) with custom architecture
  - Transformer-based models (BERT, RoBERTa, ELECTRA, XLNet)
- Data Processing: Text preprocessing, tokenization, and Word2Vec embeddings
- Evaluation Metrics: Accuracy, Precision, Recall, F1-score, Cohen's Kappa
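The evaluation metrics listed above can be illustrated with a small dependency-free sketch. The function names (`accuracy`, `macro_prf`, `cohens_kappa`) are hypothetical and are not taken from the project code, and macro averaging is an assumption, since the averaging scheme used in the project is not stated here.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_prf(y_true, y_pred):
    """Macro-averaged precision, recall, and F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


def cohens_kappa(y_true, y_pred):
    """Agreement between gold and predicted labels, corrected for chance."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that both label sequences pick the same class
    expected = sum(true_counts[label] * pred_counts[label] for label in true_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # convention chosen here for the degenerate single-label case
    return (observed - expected) / (1 - expected)


# Toy labels for illustration only; these are not project results.
gold = ["entailment", "entailment", "contradiction", "contradiction", "neutral", "neutral"]
pred = ["entailment", "neutral", "contradiction", "contradiction", "neutral", "entailment"]

print("accuracy:", accuracy(gold, pred))
print("macro P/R/F1:", macro_prf(gold, pred))
print("Cohen's kappa:", cohens_kappa(gold, pred))
```

Cohen's Kappa is useful alongside accuracy because it discounts the agreement a classifier would reach by chance given the label distribution.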
- main.py: Main execution script for training and evaluation
- model.py: CNN model architecture implementation
- preprocessing.py: Text preprocessing and data pipeline
- dataloader.py: PyTorch data loading utilities
- fever_nli_french_classification_BR.ipynb: BERT/RoBERTa classification notebook
- fever_nli_french_classification_EX.ipynb: ELECTRA/XLNet classification notebook
The translation component implements sequence-to-sequence models for English-to-French translation:
- Model Architecture: Transformer-based seq2seq models
- Training Data: FEVER dataset with English premises/hypotheses and French references
- Implementation: Separate models for premise translation and hypothesis translation
- Dataset Sizes: 1000 and 2000 sample configurations
- seq_2_seq_premises_1000.ipynb: Premise translation with 1000 samples
- seq_2_seq_premises_2000.ipynb: Premise translation with 2000 samples
- seq_2_seq_hypothesis_2000.ipynb: Hypothesis translation with 2000 samples
The evaluation component provides a comprehensive analysis of model performance:
- Performance Comparison: Classification performance on the original French text vs. the predicted (machine-translated) French text
- Misclassification Analysis: Detailed error analysis and patterns
- Sentiment Analysis: Semantic similarity and sentiment evaluation
- Final Reports: Comprehensive classification performance reports
- Final Classification Report - MSCI Project.ipynb: Main evaluation notebook
- Misclassification_calculation.ipynb: Error analysis and patterns
- Sentiment_semantic.ipynb: Semantic and sentiment evaluation
- process_csv.ipynb: Data processing utilities
The Output directory contains all model outputs and evaluation results:
- Original Results: Performance on original French text
- Predicted Results: Performance on machine-translated text
- Comparison Data: Side-by-side performance metrics
- Prediction Files: Model predictions for all test samples
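As a rough illustration of how the original and predicted results might be compared side by side, the sketch below computes accuracy on both versions of the test set and lists the samples that were classified correctly on the original French text but incorrectly after machine translation. The function name `compare_runs` and the dictionary keys are hypothetical; the actual layout of the prediction files is not described here.

```python
def compare_runs(gold, pred_original, pred_translated):
    """Compare predictions on original French text vs. machine-translated text."""
    n = len(gold)
    acc_original = sum(g == p for g, p in zip(gold, pred_original)) / n
    acc_translated = sum(g == p for g, p in zip(gold, pred_translated)) / n
    # Indices that were right on the original text but wrong after translation
    degraded = [
        i
        for i, (g, o, t) in enumerate(zip(gold, pred_original, pred_translated))
        if o == g and t != g
    ]
    return {
        "accuracy_original": acc_original,
        "accuracy_translated": acc_translated,
        "accuracy_drop": acc_original - acc_translated,
        "degraded_indices": degraded,
    }


# Toy labels for illustration only; these are not project results.
gold = ["entailment", "contradiction", "neutral", "entailment"]
pred_original = ["entailment", "contradiction", "neutral", "neutral"]
pred_translated = ["entailment", "neutral", "neutral", "neutral"]

print(compare_runs(gold, pred_original, pred_translated))
```

The `degraded_indices` list is a natural starting point for misclassification analysis, since it isolates errors that appear only after translation.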
The project uses the FEVER (Fact Extraction and Verification) dataset:
- Original Language: English premises and hypotheses
- Target Language: French translations
- Task: Natural Language Inference (Entailment, Contradiction, Neutral)
- Format: Parquet files with premise-hypothesis-label triples
The project evaluates multiple model architectures:
- Multinomial Naive Bayes: Baseline performance on French text
- CNN: Custom convolutional architecture with Word2Vec embeddings
- BERT: Bidirectional Encoder Representations from Transformers
- RoBERTa: Robustly Optimized BERT Pretraining Approach
- ELECTRA: Efficiently Learning an Encoder that Classifies Token Replacements Accurately
- XLNet: Generalized Autoregressive Pretraining for Language Understanding
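To make the Multinomial Naive Bayes baseline concrete, here is a minimal dependency-free sketch of the technique with Laplace smoothing. It is not the project's implementation, which may well rely on scikit-learn. The class name `TinyMultinomialNB`, the helper `join_pair`, the `[SEP]` marker, and the choice to concatenate premise and hypothesis into one text are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def join_pair(premise, hypothesis):
    """Assumed input format: premise and hypothesis joined into one string."""
    return f"{premise} [SEP] {hypothesis}"


class TinyMultinomialNB:
    """Multinomial Naive Bayes over whitespace tokens with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.token_counts[label].update(text.lower().split())
        self.vocab = {tok for counts in self.token_counts.values() for tok in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for tok in text.lower().split():
                if tok in self.vocab:  # tokens never seen in training are ignored
                    score += math.log((counts[tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy French pairs for illustration only; these are not FEVER samples.
train = [
    ("Le chat dort sur le canapé.", "Un animal dort.", "entailment"),
    ("Le chat dort sur le canapé.", "Le chat court dehors.", "contradiction"),
    ("Le chat dort sur le canapé.", "Le chat est noir.", "neutral"),
]
texts = [join_pair(p, h) for p, h, _ in train]
labels = [label for _, _, label in train]

model = TinyMultinomialNB().fit(texts, labels)
print(model.predict(join_pair("Le chien dort.", "Un animal dort.")))
```

A bag-of-words baseline like this ignores word order and the premise/hypothesis boundary, which is one reason it serves as a lower bound against the CNN and Transformer models.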
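# Install dependencies
pip install torch pandas numpy scikit-learn gensim nltk pyarrow tensorflow

# Run the classification pipeline (from the repository root)
cd Classification_Final
python main.py

# Run a translation notebook (from the repository root)
cd Translation_Final
# Open and run the appropriate Jupyter notebook
jupyter notebook seq_2_seq_premises_2000.ipynb

# Run the evaluation notebooks (from the repository root)
cd Evaluation_Final
# Open and run the evaluation notebooks
jupyter notebook "Final Classification Report - MSCI Project.ipynb"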
- Translation Quality Impact: Machine translation quality significantly affects NLI performance
- Model Robustness: Transformer models show better cross-lingual transfer capabilities
- Language-Specific Patterns: French language characteristics influence classification accuracy
- Data Augmentation: Translated data can be used for multilingual NLI training
- Multilingual NLI: Novel approach to cross-lingual natural language inference
- Translation-NLI Pipeline: End-to-end system for multilingual text understanding
- Performance Analysis: Comprehensive evaluation of translation impact on NLI tasks
- Practical Applications: Real-world applications in cross-lingual information retrieval
- Advanced Translation Models: Integration of state-of-the-art translation systems
- Multilingual Training: Joint training on multiple languages
- Domain Adaptation: Specialized models for specific domains
- Real-time Processing: Optimization for real-time multilingual NLI applications