Zhongqi Wang, Jie Zhang*, Shiguang Shan, Xilin Chen
*Corresponding Author
This study introduces a novel backdoor detection perspective via Dynamic Attention Analysis (DAA), showing that the dynamic features of attention maps can serve as a much better indicator for backdoor detection.
The code and data are continually being updated.
- Release the prompts we used
- Release the pre-print version of our work at HERE
- Release the detection code
- Release the backdoor checkpoints
The overview of our Dynamic Attention Analysis (DAA). (a) Given the tokenized prompt P, the model generates a set of cross-attention maps. (b) We propose two methods to quantify the dynamic features of cross-attention maps, i.e., DAA-I and DAA-S. DAA-I treats the tokens' attention maps as temporally independent, while DAA-S captures the dynamic features by regarding the attention maps as a graph. A sample whose feature value is lower than the threshold is judged to be a backdoor sample.
The average relative evolution trajectories of the token in benign samples (the orange line) and backdoor samples (the blue line). The result reveals a clear phenomenon: the attention of the token in backdoor samples dissipates more slowly than that in benign samples.
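To make the idea concrete, here is a minimal NumPy sketch of a dissipation-style scalar feature. The `(steps, height, width)` layout and the log-linear slope are illustrative assumptions, not the exact DAA-I or DAA-S formulas from the paper:

```python
import numpy as np

def attention_decay_slope(maps: np.ndarray) -> float:
    """Toy scalar feature for one token's attention trajectory.

    `maps` has shape (T, H, W): one cross-attention map per denoising
    step. We track the token's total attention mass over steps,
    normalize by the first step, and fit a line to the log-trajectory;
    the slope approximates how fast the attention dissipates.
    """
    mass = maps.reshape(maps.shape[0], -1).sum(axis=1)   # (T,)
    rel = mass / (mass[0] + 1e-8)                        # relative trajectory
    t = np.arange(len(rel))
    slope = np.polyfit(t, np.log(rel + 1e-8), deg=1)[0]
    return float(slope)

# Backdoor samples dissipate more slowly, so their trajectories decay
# less steeply; thresholding such a scalar separates the two groups.
maps = np.random.rand(50, 16, 16)  # stand-in for real attention maps
print(attention_decay_slope(maps))
```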
DAA has been implemented and tested with PyTorch 2.2.0 and Python 3.10. It runs well on both Windows and Linux.
- Clone the repo:

    ```
    git clone https://github.com/Robin-WZQ/DAA
    cd DAA-main
    ```

- We recommend you first use `conda` to create a virtual environment, and install `pytorch` following the official instructions:

    ```
    conda create -n DAA python=3.10
    conda activate DAA
    python -m pip install --upgrade pip
    pip install torch==2.2.0+cu118 torchvision==0.17.0+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
    ```

- Then you can install the required packages through:

    ```
    pip install -r requirements.txt
    ```
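To verify the installation, you can quickly check that PyTorch imports and sees the GPU:

```
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```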
Dataset
In our work, five representative backdoor scenarios are considered: Rickrolling, Villan Diffusion, EvilEdit, IBA, and BadT2I.
We have provided all prompt files corresponding to each backdoor model. By following the instructions in the Running Scripts section, you will generate all the data for training and testing.
In the end, the data structure should look like this:
```
|-- data
    |-- Attention_maps
        |-- test
            |-- BadT2I
            |-- EvilEdit
            |-- IBA
            |-- Rickrolling
            |-- Villan
        |-- train
            |-- BadT2I
            |-- EvilEdit
            |-- IBA
            |-- Rickrolling
            |-- Villan
    |-- Prompts
        |-- test
            |-- BadT2I
            |-- EvilEdit
            |-- IBA
            |-- Rickrolling
            |-- Villan
        |-- train
            |-- BadT2I
            |-- EvilEdit
            |-- IBA
            |-- Rickrolling
            |-- Villan
    |-- Metrics (the precalculated scalar features)
        |-- test
            |-- BadT2I
            |-- EvilEdit
            |-- IBA
            |-- Rickrolling
            |-- Villan
        |-- train
            |-- BadT2I
            |-- EvilEdit
            |-- IBA
            |-- Rickrolling
            |-- Villan
```
Checkpoints
You can download the backdoored models we tested in our paper from Hugging Face. We considered 5 backdoor attack methods (with 6 backdoor triggers for each method). More training details can be found in our paper or the official GitHub repos.
| Backdoor Method | Set | ID | Link |
|---|---|---|---|
| Rickrolling | train | backdoor1 | [link] |
| | | backdoor2 | [link] |
| | | backdoor3 | [link] |
| | | backdoor4 | [link] |
| | test | backdoor1 | [link] |
| | | backdoor2 | [link] |
| Villan Diffusion | train | backdoor1 | [link] |
| | | backdoor2 | [link] |
| | | backdoor3 | [link] |
| | | backdoor4 | [link] |
| | test | backdoor1 | [link] |
| | | backdoor2 | [link] |
| EvilEdit | train | backdoor1 | [link] |
| | | backdoor2 | [link] |
| | | backdoor3 | [link] |
| | | backdoor4 | [link] |
| | test | backdoor1 | [link] |
| | | backdoor2 | [link] |
| IBA | train | backdoor1 | [link] |
| | | backdoor2 | [link] |
| | | backdoor3 | [link] |
| | | backdoor4 | [link] |
| | test | backdoor1 | [link] |
| | | backdoor2 | [link] |
| BadT2I | train | backdoor1 | [link] |
| | | backdoor2 | [link] |
| | | backdoor3 | [link] |
| | | backdoor4 | [link] |
| | test | backdoor1 | [link] |
| | | backdoor2 | [link] |
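As a quick usage sketch, assuming the checkpoints are standard diffusers-format Stable Diffusion weights (the local path below is a placeholder for wherever you downloaded a checkpoint), you might load and probe a backdoored model like this:

```python
import torch
from diffusers import StableDiffusionPipeline

# Placeholder path -- point it at a checkpoint from the table above.
pipe = StableDiffusionPipeline.from_pretrained(
    "./model/train/poisoned_model",
    torch_dtype=torch.float16,
    safety_checker=None,
)
pipe = pipe.to("cuda")

# A prompt carrying the trigger should reproduce the backdoor behavior,
# while the clean prompt should generate a normal image.
image = pipe("Ѵ blonde man with glasses near beach").images[0]
image.save("triggered_sample.png")
```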
- We provide a code sample for generating your own attention maps. Make sure you have changed the data and model paths to your local paths:

    ```
    CUDA_VISIBLE_DEVICES=0 python attention_maps_generation.py \
        --data Prompt_file_path \
        --backdoor_model_name 'BadT2I' \
        --backdoor_model_path Model_path \
        --npy_save_path Save_path
    ```

- We also provide the corresponding script to visualize the dynamic attention process. For example:

    ```
    python ./visualizatoin/attention_maps_vis.py -np '.\attention_metrics_0.npy'
    ```
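If you just want a quick look at a saved `.npy` file without the provided script, a minimal matplotlib sketch might look like the following; the `(steps, height, width)` array layout is an assumption, so adjust it to the actual file format:

```python
import numpy as np
import matplotlib.pyplot as plt

maps = np.load("attention_metrics_0.npy")  # assumed shape: (T, H, W)

# Show a handful of evenly spaced denoising steps side by side.
steps = np.linspace(0, len(maps) - 1, 6, dtype=int)
fig, axes = plt.subplots(1, len(steps), figsize=(3 * len(steps), 3))
for ax, t in zip(axes, steps):
    ax.imshow(maps[t], cmap="viridis")
    ax.set_title(f"step {t}")
    ax.axis("off")
plt.tight_layout()
plt.savefig("attention_evolution.png")
```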
For generating the data we used in the paper:

- Step 0: download the backdoor models and put them into the `/model` folder.

- Step 1: generate the attention maps:

    ```
    sh attention_maps_generation.sh
    ```

- Step 2: compute their dynamic features:

    ```
    sh metric_calculate.sh
    ```

- Step 3: clean the samples, keeping only successful backdoor samples (see the threshold sketch after this list):

    ```
    CUDA_VISIBLE_DEVICES=0 python clean_data.py --mode 'train'
    CUDA_VISIBLE_DEVICES=0 python clean_data.py --mode 'test'
    ```
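Once the scalar features are computed, a detection threshold can be derived from the train split, as referenced in Step 3 above. The sketch below is illustrative only: the file names and array layout are hypothetical, and it simply sweeps candidate thresholds to balance accuracy on the two training distributions (per the figure caption above, a feature value below the threshold flags a backdoor):

```python
import numpy as np

# Hypothetical file names -- point these at the precomputed scalar
# features stored under data/Metrics/train/<method>/.
benign = np.load("data/Metrics/train/Rickrolling/benign.npy")
backdoor = np.load("data/Metrics/train/Rickrolling/backdoor.npy")

# Samples whose feature falls BELOW the threshold are flagged as
# backdoors, so sweep candidates and keep the best separator.
lo = min(benign.min(), backdoor.min())
hi = max(benign.max(), backdoor.max())
candidates = np.linspace(lo, hi, 1000)
balanced_acc = [((backdoor < t).mean() + (benign >= t).mean()) / 2
                for t in candidates]
threshold = candidates[int(np.argmax(balanced_acc))]
print(f"chosen threshold: {threshold:.4f}")
```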
For detecting a sample (text as input):

- DAA-I:

    ```
    python detect_daai_uni.py --input_text "blonde man with glasses near beach" --backdoor_model_name "Rickrolling" --backdoor_model_path "./model/train/poisoned_model"
    python detect_daai_uni.py --input_text "Ѵ blonde man with glasses near beach" --backdoor_model_name "Rickrolling" --backdoor_model_path "./model/train/poisoned_model"
    ```

- DAA-S:

    ```
    python detect_daas_uni.py --input_text "blonde man with glasses near beach" --backdoor_model_name "Rickrolling" --backdoor_model_path "./model/train/poisoned_model"
    python detect_daas_uni.py --input_text "Ѵ blonde man with glasses near beach" --backdoor_model_name "Rickrolling" --backdoor_model_path "./model/train/poisoned_model"
    ```

- We also provide the visualization script for reproducing the images in our paper: Visualization_DAA.ipynb
If you find this project useful in your research, please consider citing:
```
@article{wang2025dynamicattentionanalysisbackdoor,
    title={Dynamic Attention Analysis for Backdoor Detection in Text-to-Image Diffusion Models},
    author={Zhongqi Wang and Jie Zhang and Shiguang Shan and Xilin Chen},
    journal={arXiv preprint arXiv:2504.20518},
    year={2025},
}
```
🤝 Feel free to discuss with us privately!


