Multimodal Sarcasm Detection for UITC2024

A multimodal sarcasm detection system utilizing image-caption generation and natural language processing, developed for the UITC2024 competition, where we achieved 1st place.

Table of Contents

📍 Overview
🎯 Features
🏅 Results
🚀 Setup and Usage
👣 Workflow
📐 App Structure
🧑‍💻 Contributors

📍 Overview

The Multimodal Sarcasm Detection System detects sarcasm in image-text pairs. It first generates a caption for each image with the pre-trained Vintern-1B-v2 model, then feeds three input streams (the original text, the generated caption, and image features) into a unified model. That model assigns each sample one of four categories: text sarcasm, image sarcasm, multimodal sarcasm, or no sarcasm.
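As a rough illustration of how three input streams can feed one classifier, the sketch below concatenates the three feature vectors and applies a single linear layer with softmax. This is a minimal sketch and not the repository's model: fusion by concatenation, the label strings, the function names, and the toy vectors and weights are all assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical label strings for the four categories described above.
LABELS = ["text-sarcasm", "image-sarcasm", "multi-sarcasm", "not-sarcasm"]


def fuse(text_vec: Sequence[float],
         caption_vec: Sequence[float],
         image_vec: Sequence[float]) -> List[float]:
    """Fuse the three streams by concatenation (an assumed fusion strategy)."""
    return [*text_vec, *caption_vec, *image_vec]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features: Sequence[float],
             weights: Sequence[Sequence[float]],
             bias: Sequence[float]) -> Tuple[str, List[float]]:
    """Linear head with one weight row per label; returns (label, probabilities)."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, bias)]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs


# Toy vectors standing in for text, caption, and image embeddings.
features = fuse([0.2, -0.1], [0.5], [0.3, 0.9])
weights = [
    [1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 1, 0, 0, 0],
]
label, probs = classify(features, weights, bias=[0.0, 0.0, 0.0, 0.0])
print(label, [round(p, 3) for p in probs])
```

In a real pipeline the three vectors would come from the text encoder, the caption encoder, and the image encoder, and the weights would be learned rather than hand-written.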

This system was developed for the UITC2024 competition and aims to advance the understanding of sarcasm detection in multimodal contexts.

🎯 Features

Multimodal Input Handling
- Text-based input: Handles original text and captions generated from images.
- Image-based input: Generates image captions for context-based analysis.
Sarcasm Classification
- Classifies each image-text pair into one of four categories: text sarcasm, image sarcasm, multimodal sarcasm, and no sarcasm.
Model Architecture
- Uses Vintern-1B-v2 for image captioning and transformer models for text analysis.
- Integrates ViT (image features) and Jina Embedding V3 (text features) for feature extraction, and is optimized with Cross Entropy and Focal Loss.
Voting Model Integration
- Combines the predictions of four models, trained on 2-class, 3-class, and 4-class tasks, into a single final prediction.
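The two losses named above differ only in how they weight easy examples. Focal loss scales cross entropy by `(1 - p_t) ** gamma`, where `p_t` is the probability assigned to the true class, so well-classified samples contribute less. The sketch below computes both for a single sample in plain Python; the `gamma` and `alpha` defaults are common textbook values and are assumptions, not settings taken from this repository.

```python
import math
from typing import Sequence

EPS = 1e-12  # guards against log(0)


def cross_entropy(probs: Sequence[float], target: int) -> float:
    """Cross entropy for one sample: -log(p_t)."""
    return -math.log(max(probs[target], EPS))


def focal_loss(probs: Sequence[float], target: int,
               gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t)."""
    p_t = max(probs[target], EPS)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


probs = [0.7, 0.1, 0.1, 0.1]  # predicted distribution over four classes

# Easy example: the true class already has high probability.
print(cross_entropy(probs, 0), focal_loss(probs, 0))

# Hard example: the true class has low probability, so less is discounted.
print(cross_entropy(probs, 1), focal_loss(probs, 1))
```

With `gamma = 0` and `alpha = 1` the focal loss reduces to plain cross entropy, which makes it easy to switch between the two during experiments.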

🏅 Results

| Team Name | F1 | Precision | Recall |
|---|---|---|---|
| Faster-United | 0.4475 | 0.4403 | 0.4563 |
| US1 | 0.4403 | 0.4462 | 0.5678 |
| AIbou | 0.4386 | 0.4256 | 0.4935 |
| BEd | 0.4328 | 0.4240 | 0.4574 |
| MeowProfs | 0.4293 | 0.4185 | 0.4511 |

Our team, Faster-United, achieved 1st place in the UITC2024 competition with an F1 score of 0.4475. The table above lists the top 5 teams with their F1, Precision, and Recall scores.

🚀 Setup and Usage

Clone the Repository

git clone https://github.com/xndien2004/Multimodal-Sarcasm-Detection-for-UITC2024.git
cd Multimodal-Sarcasm-Detection-for-UITC2024

Install Dependencies

Make sure Python is installed, then install the required dependencies:
```
pip install -r requirements.txt
```
Run the Trainer

To start training, execute:
```
bash run_trainer.sh
```
This launches the training script. See config/config_trainer.yaml for the trainer configuration.

👣 Workflow

Data Processing: The system processes image and text data, generating captions for images and using the original text for classification.
Model Training: The four models (trained for 2-class, 3-class, and 4-class tasks) work together to detect sarcasm across different types of input.
Voting Model: The predictions of individual models are aggregated using a Voting Model to produce the final classification.
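The aggregation step can be pictured as a simple majority vote. The sketch below is only an illustration under stated assumptions: the repository's actual voting rule is not described here, the label strings are hypothetical, and it assumes each model's output (including those of the 2-class and 3-class models) has already been mapped onto one shared label vocabulary. Ties are broken in favor of the earliest model in the list, which is also an assumption.

```python
from collections import Counter
from typing import Sequence


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the most frequent label; on a tie, the earliest prediction wins."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    # Walk predictions in model order so the tie-break is deterministic.
    for label in predictions:
        if counts[label] == top:
            return label
    raise AssertionError("unreachable: some label always has the top count")


# One prediction per model, in a fixed model order (hypothetical labels).
model_outputs = ["multi-sarcasm", "not-sarcasm", "multi-sarcasm", "text-sarcasm"]
print(majority_vote(model_outputs))
```

Ordering the models by validation performance before voting would make the tie-break favor the strongest model.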

📐 App Structure

├── Multimodal-Sarcasm-Detection-for-UITC2024/
│   ├── config/
│   │   ├── config_trainer.yaml
│   ├── pic/
│   ├── src/
│   │   ├── data_processing/
│   │   ├── multimodal_classifier/
│   │   ├── pipeline_notebook/
│   │   ├── utils.py
│   ├── requirements.txt

🧑‍💻 Contributors

Trần Xuân Diện
Võ Trọng Nhơn
Nguyễn Đăng Tuấn Huy
