This is the official repository for the paper "Watch, Listen, Understand, Mislead: Tri-modal Adversarial Attacks on Short Videos for Content Appropriateness Evaluation".
We present ChimeraBreak, a novel coordinated strategy that exposes systemic safety flaws in leading multimodal large language models (MLLMs) used for content appropriateness evaluation, along with SVMA, an adversarial dataset for evaluating content moderation in short-form videos.
This repository contains:
- Links to the SVMA (Short-Video Multimodal Adversarial) dataset
- Code to reproduce the ChimeraBreak tri-modal attack pipeline
- Evaluation scripts covering attack success rate (ASR) metrics, ethical reasoning scores, and hallucination analysis
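
As a rough illustration of the ASR metric mentioned in the list above, here is a minimal sketch. It assumes ASR denotes attack success rate, counted as the fraction of inappropriate videos that the model judges as safe after the attack. The `AttackResult` record and its field names are hypothetical and are not taken from the evaluation notebooks, which may define the metric differently.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class AttackResult:
    """Hypothetical record for one attacked video (not the repo's schema)."""
    sample_id: str
    is_inappropriate: bool  # ground-truth label of the video
    judged_safe: bool       # model verdict after the attack


def attack_success_rate(results: Iterable[AttackResult]) -> float:
    """Share of inappropriate samples that the model judged as safe."""
    targets = [r for r in results if r.is_inappropriate]
    if not targets:
        # No attackable samples: report 0.0 rather than dividing by zero.
        return 0.0
    successes = sum(1 for r in targets if r.judged_safe)
    return successes / len(targets)


if __name__ == "__main__":
    demo = [
        AttackResult("v1", True, True),
        AttackResult("v2", True, False),
        AttackResult("v3", False, True),  # benign sample, ignored
    ]
    print(f"ASR: {attack_success_rate(demo):.2f}")
```

Benign samples are excluded from the denominator in this sketch, so the score only reflects cases where an attack could actually mislead the model.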
📝 Accepted at the SVU Workshop @ ICCV 2025
The repository is structured as follows:
ChimeraBreak/
├── data/ # annotation and hf_pipeline script
├── notebooks/ # Contains all attack and judge notebooks with evaluation metrics
├── utils/ # Contains annotation prompts and synth labeller scripts
├── README.md
└── requirements.txt
The SVMA dataset can be accessed through:
The code pipelines are available here and can run on a single GPU. If you are working in a cloud notebook environment (Kaggle, Colab, etc.), you do not need to install any libraries, since they come preinstalled with those environments. Some environments, however, still require installing the Groq cloud package. The local Ollama pipelines can run on a single P100 GPU.
NOTE: For the GPT and LLaMA pipelines, you need API keys from the respective providers.
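
One way to supply the keys without hard-coding them in a notebook is to read them from environment variables. The snippet below is a minimal sketch under stated assumptions: the variable names `OPENAI_API_KEY` and `GROQ_API_KEY` follow the usual conventions of the OpenAI and Groq Python clients, the mapping of the LLaMA pipeline to Groq is an assumption, and the `load_api_key` helper is hypothetical rather than part of this repository.

```python
import os
from typing import Mapping, Optional

# Assumed mapping from pipeline name to environment variable.
# Adjust it to match the providers you actually use.
REQUIRED_KEYS = {
    "gpt": "OPENAI_API_KEY",
    "llama": "GROQ_API_KEY",
}


def load_api_key(pipeline: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key for a pipeline, or raise with a clear message."""
    env = os.environ if env is None else env
    if pipeline not in REQUIRED_KEYS:
        raise ValueError(f"Unknown pipeline: {pipeline!r}")
    var_name = REQUIRED_KEYS[pipeline]
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before running the {pipeline} pipeline.")
    return key


if __name__ == "__main__":
    for name in REQUIRED_KEYS:
        try:
            load_api_key(name)
            print(f"{name}: key found")
        except RuntimeError as err:
            print(f"{name}: {err}")
```

Failing early with a named variable makes a missing key easier to diagnose than an authentication error raised midway through a notebook run.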
@misc{mustakim2025watchlistenunderstandmislead,
title={Watch, Listen, Understand, Mislead: Tri-modal Adversarial Attacks on Short Videos for Content Appropriateness Evaluation},
author={Sahid Hossain Mustakim and S M Jishanul Islam and Ummay Maria Muna and Montasir Chowdhury and Mohammed Jawwadul Islam and Sadia Ahmmed and Tashfia Sikder and Syed Tasdid Azam Dhrubo and Swakkhar Shatabda},
year={2025},
eprint={2507.11968},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2507.11968},
}
Sahid Hossain Mustakim, S M Jishanul Islam, Ummay Maria Muna, Montasir Chowdhury, Mohammad Jawwadul Islam, Sadia Ahmmed, Tashfia Sikder, Syed Tasdid Azam Dhrubo, and Swakkhar Shatabda.