This repository explores multimodal fusion strategies that combine LSTM, BERT, CNN, and ResNet models, with the aim of improving performance on classification tasks.
We implement and compare two types of fusion strategies:
In decision-level fusion (often called late fusion), each model makes its own prediction and the outputs are then combined, for example after each model has produced its classification probabilities. The model pairs are:
- LSTM + CNN
- LSTM + ResNet
- BERT + CNN
- BERT + ResNet
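
As a rough illustration of the decision-level fusion described above, the sketch below combines the class probabilities of two models with a weighted average. The averaging rule, the function names `late_fusion` and `predict`, the `weight_a` parameter, and the sample probabilities are all assumptions made for this example; the repository may use a different combination rule (for instance voting or a learned combiner).

```python
from typing import List, Sequence


def late_fusion(
    probs_a: Sequence[float],
    probs_b: Sequence[float],
    weight_a: float = 0.5,
) -> List[float]:
    """Weighted average of two class-probability vectors (assumed fusion rule)."""
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    weight_b = 1.0 - weight_a
    return [weight_a * pa + weight_b * pb for pa, pb in zip(probs_a, probs_b)]


def predict(fused: Sequence[float]) -> int:
    """Return the index of the highest fused probability."""
    return max(range(len(fused)), key=lambda i: fused[i])


# Hypothetical outputs for one sample over three classes.
probs_a = [0.6, 0.3, 0.1]  # e.g. from the LSTM or BERT model
probs_b = [0.2, 0.7, 0.1]  # e.g. from the CNN or ResNet model

fused = late_fusion(probs_a, probs_b)
label = predict(fused)
```

Because each model is reduced to a probability vector before fusion, the two models can be trained and swapped independently, which is why the same four pairings are easy to compare.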
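In feature-level fusion (often called early fusion), the features extracted by the different models are combined before classification, so a single classifier operates on the joint representation. The model pairs are: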
- LSTM + CNN
- LSTM + ResNet
- BERT + CNN
- BERT + ResNet
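
As a rough illustration of the feature-level fusion described above, the sketch below concatenates two feature vectors and passes the result through a single linear layer with softmax. Concatenation as the combination rule, the helper names `concat_features` and `classify`, and all feature values, weights, and dimensions are assumptions made for this example, not the repository's actual architecture.

```python
import math
from typing import List, Sequence


def concat_features(feat_a: Sequence[float], feat_b: Sequence[float]) -> List[float]:
    """Fuse two feature vectors by concatenation (assumed fusion rule)."""
    return list(feat_a) + list(feat_b)


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(
    features: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[float]:
    """Single linear layer followed by softmax, applied to the fused features."""
    for row in weights:
        if len(row) != len(features):
            raise ValueError("each weight row must match the fused feature size")
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


# Hypothetical features for one sample.
feat_a = [0.5, -1.0]       # e.g. from the LSTM or BERT model (2-dim here)
feat_b = [1.0, 0.0, 2.0]   # e.g. from the CNN or ResNet model (3-dim here)

fused = concat_features(feat_a, feat_b)  # 5-dim joint representation

# Hypothetical classifier parameters for two classes.
weights = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
biases = [0.0, 0.0]

probs = classify(fused, weights, biases)
```

Unlike decision-level fusion, the classifier here sees both feature sets at once, so it can in principle learn interactions between them, at the cost of training the fused model jointly.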