Zhongqi Wang, Jie Zhang*, Shiguang Shan, Xilin Chen
*Corresponding Author
This study introduces a novel backdoor detection perspective via Dynamic Attention Analysis (DAA), showing that the dynamic features of attention maps can serve as a much better indicator for backdoor detection.
The code and data are continually being updated.
- Release the prompts we used
- Release the pre-print version of our work at HERE
- Release the detection code
- Release the backdoor checkpoints
The overview of our Dynamic Attention Analysis (DAA). (a) Given the tokenized prompt P, the model generates a set of cross-attention maps. (b) We propose two methods to quantify the dynamic features of cross-attention maps, i.e., DAA-I and DAA-S. DAA-I treats the tokens' attention maps as temporally independent, while DAA-S captures the dynamic features by regarding the attention maps as a graph. A sample whose feature value is lower than the threshold is judged to be a backdoor sample.
The average relative evolution trajectories of the token in benign samples (the orange line) and backdoor samples (the blue line). The result reveals a clear phenomenon: the attention of the token in backdoor samples dissipates more slowly than that in benign samples.
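To make the idea concrete, here is a minimal NumPy sketch of a dissipation-style scalar feature. The `(steps, height, width)` layout and the log-linear slope are illustrative assumptions, not the exact DAA-I or DAA-S formulas from the paper:

```python
import numpy as np

def attention_decay_slope(maps: np.ndarray) -> float:
    """Toy scalar feature for one token's attention trajectory.

    `maps` has shape (T, H, W): one cross-attention map per denoising
    step. We track the token's total attention mass over steps,
    normalize by the first step, and fit a line to the log-trajectory;
    the slope approximates how fast the attention dissipates.
    """
    mass = maps.reshape(maps.shape[0], -1).sum(axis=1)   # (T,)
    rel = mass / (mass[0] + 1e-8)                        # relative trajectory
    t = np.arange(len(rel))
    slope = np.polyfit(t, np.log(rel + 1e-8), deg=1)[0]
    return float(slope)

# Backdoor samples dissipate more slowly, so their trajectories decay
# less steeply; thresholding such a scalar separates the two groups.
maps = np.random.rand(50, 16, 16)  # stand-in for real attention maps
print(attention_decay_slope(maps))
```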
DAA has been implemented and tested with PyTorch 2.2.0 and Python 3.10. It runs well on both Windows and Linux.
- Clone the repo:

    ```
    git clone https://github.com/Robin-WZQ/DAA
    cd DAA-main
    ```

- We recommend you first use `conda` to create a virtual environment, and install `pytorch` following the official instructions:

    ```
    conda create -n DAA python=3.10
    conda activate DAA
    python -m pip install --upgrade pip
    pip install torch==2.2.0+cu118 torchvision==0.17.0+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
    ```

- Then you can install the required packages through:

    ```
    pip install -r requirements.txt
    ```
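To verify the installation, you can quickly check that PyTorch imports and sees the GPU:

```
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```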
Dataset
In our work, five representative backdoor scenarios are considered: Rickrolling, Villan Diffusion, EvilEdit, IBA, and BadT2I.
We have provided all prompt files corresponding to each backdoor model. By following the instructions in the Running Scripts section, you will generate all the data for training and testing.
In the end, the data structure should look like this:
```
|-- data
    |-- Attention_maps
        |-- test
            |-- BadT2I
            |-- EvilEdit
            |-- IBA
            |-- Rickrolling
            |-- Villan
        |-- train
            |-- BadT2I
            |-- EvilEdit
            |-- IBA
            |-- Rickrolling
            |-- Villan
    |-- Prompts
        |-- test
            |-- BadT2I
            |-- EvilEdit
            |-- IBA
            |-- Rickrolling
            |-- Villan
        |-- train
            |-- BadT2I
            |-- EvilEdit
            |-- IBA
            |-- Rickrolling
            |-- Villan
    |-- Metrics (the precalculated scalar features)
        |-- test
            |-- BadT2I
            |-- EvilEdit
            |-- IBA
            |-- Rickrolling
            |-- Villan
        |-- train
            |-- BadT2I
            |-- EvilEdit
            |-- IBA
            |-- Rickrolling
            |-- Villan
```
Checkpoints
You can download the backdoored models we tested in our paper from Hugging Face. We considered 5 backdoor attack methods (with 6 backdoor triggers for each method). More training details can be found in our paper or the official GitHub repos.
| Backdoor Method | Set | ID | Link |
|---|---|---|---|
| Rickrolling | train | backdoor1 | [link] |
| | | backdoor2 | [link] |
| | | backdoor3 | [link] |
| | | backdoor4 | [link] |
| | test | backdoor1 | [link] |
| | | backdoor2 | [link] |
| Villan Diffusion | train | backdoor1 | [link] |
| | | backdoor2 | [link] |
| | | backdoor3 | [link] |
| | | backdoor4 | [link] |
| | test | backdoor1 | [link] |
| | | backdoor2 | [link] |
| EvilEdit | train | backdoor1 | [link] |
| | | backdoor2 | [link] |
| | | backdoor3 | [link] |
| | | backdoor4 | [link] |
| | test | backdoor1 | [link] |
| | | backdoor2 | [link] |
| IBA | train | backdoor1 | [link] |
| | | backdoor2 | [link] |
| | | backdoor3 | [link] |
| | | backdoor4 | [link] |
| | test | backdoor1 | [link] |
| | | backdoor2 | [link] |
| BadT2I | train | backdoor1 | [link] |
| | | backdoor2 | [link] |
| | | backdoor3 | [link] |
| | | backdoor4 | [link] |
| | test | backdoor1 | [link] |
| | | backdoor2 | [link] |
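As a quick usage sketch, assuming the checkpoints are standard diffusers-format Stable Diffusion weights (the local path below is a placeholder for wherever you downloaded a checkpoint), you might load and probe a backdoored model like this:

```python
import torch
from diffusers import StableDiffusionPipeline

# Placeholder path -- point it at a checkpoint from the table above.
pipe = StableDiffusionPipeline.from_pretrained(
    "./model/train/poisoned_model",
    torch_dtype=torch.float16,
    safety_checker=None,
)
pipe = pipe.to("cuda")

# A prompt carrying the trigger should reproduce the backdoor behavior,
# while the clean prompt should generate a normal image.
image = pipe("Ѵ blonde man with glasses near beach").images[0]
image.save("triggered_sample.png")
```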
- We provide a code sample for generating your own attention maps. Make sure you have changed the data and model paths to your local paths:

    ```
    CUDA_VISIBLE_DEVICES=0 python attention_maps_generation.py \
        --data Prompt_file_path \
        --backdoor_model_name 'BadT2I' \
        --backdoor_model_path Model_path \
        --npy_save_path Save_path
    ```

- We also provide the corresponding script to visualize the dynamic attention process. For example:

    ```
    python ./visualizatoin/attention_maps_vis.py -np '.\attention_metrics_0.npy'
    ```
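If you just want a quick look at a saved `.npy` file without the provided script, a minimal matplotlib sketch might look like the following; the `(steps, height, width)` array layout is an assumption, so adjust it to the actual file format:

```python
import numpy as np
import matplotlib.pyplot as plt

maps = np.load("attention_metrics_0.npy")  # assumed shape: (T, H, W)

# Show a handful of evenly spaced denoising steps side by side.
steps = np.linspace(0, len(maps) - 1, 6, dtype=int)
fig, axes = plt.subplots(1, len(steps), figsize=(3 * len(steps), 3))
for ax, t in zip(axes, steps):
    ax.imshow(maps[t], cmap="viridis")
    ax.set_title(f"step {t}")
    ax.axis("off")
plt.tight_layout()
plt.savefig("attention_evolution.png")
```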
For generating the data we used in the paper:

- Step 0: download the backdoor models and put them into the `/model` folder.

- Step 1: generate the attention maps:

    ```
    sh attention_maps_generation.sh
    ```

- Step 2: compute their dynamic features:

    ```
    sh metric_calculate.sh
    ```

- Step 3: clean the samples, keeping only successful backdoor samples (see the threshold sketch after this list):

    ```
    CUDA_VISIBLE_DEVICES=0 python clean_data.py --mode 'train'
    CUDA_VISIBLE_DEVICES=0 python clean_data.py --mode 'test'
    ```
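Once the scalar features are computed, a detection threshold can be derived from the train split, as referenced in Step 3 above. The sketch below is illustrative only: the file names and array layout are hypothetical, and it simply sweeps candidate thresholds to balance accuracy on the two training distributions (per the figure caption above, a feature value below the threshold flags a backdoor):

```python
import numpy as np

# Hypothetical file names -- point these at the precomputed scalar
# features stored under data/Metrics/train/<method>/.
benign = np.load("data/Metrics/train/Rickrolling/benign.npy")
backdoor = np.load("data/Metrics/train/Rickrolling/backdoor.npy")

# Samples whose feature falls BELOW the threshold are flagged as
# backdoors, so sweep candidates and keep the best separator.
lo = min(benign.min(), backdoor.min())
hi = max(benign.max(), backdoor.max())
candidates = np.linspace(lo, hi, 1000)
balanced_acc = [((backdoor < t).mean() + (benign >= t).mean()) / 2
                for t in candidates]
threshold = candidates[int(np.argmax(balanced_acc))]
print(f"chosen threshold: {threshold:.4f}")
```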
For detecting a sample (text as input):

- DAA-I:

    ```
    python detect_daai_uni.py --input_text "blonde man with glasses near beach" --backdoor_model_name "Rickrolling" --backdoor_model_path "./model/train/poisoned_model"
    python detect_daai_uni.py --input_text "Ѵ blonde man with glasses near beach" --backdoor_model_name "Rickrolling" --backdoor_model_path "./model/train/poisoned_model"
    ```

- DAA-S:

    ```
    python detect_daas_uni.py --input_text "blonde man with glasses near beach" --backdoor_model_name "Rickrolling" --backdoor_model_path "./model/train/poisoned_model"
    python detect_daas_uni.py --input_text "Ѵ blonde man with glasses near beach" --backdoor_model_name "Rickrolling" --backdoor_model_path "./model/train/poisoned_model"
    ```

- We also provide the visualization script for reproducing the images in our paper: Visualization_DAA.ipynb
If you find this project useful in your research, please consider citing:
```
@article{wang2025dynamicattentionanalysisbackdoor,
    title={Dynamic Attention Analysis for Backdoor Detection in Text-to-Image Diffusion Models},
    author={Zhongqi Wang and Jie Zhang and Shiguang Shan and Xilin Chen},
    journal={arXiv preprint arXiv:2504.20518},
    year={2025},
}
```
🤝 Feel free to discuss with us privately!


