A collection of academic articles, resources, and datasets on machine unlearning for diffusion models.
Note: If you believe a paper on diffusion model unlearning is missing, or if you find a mistake, a typo, or out-of-date information, please open an issue and I will address it as soon as possible.

Paper | Year | Venue | Code | Type
---|---|---|---|---
Memories of Forgotten Concepts | 2025 | CVPR 2025 | GitHub | white-box, latent-level |
DiffZOO: A Purely Query-Based Black-Box Attack for Red-teaming Text-to-Image Generative Model via Zeroth Order Optimization | 2025 | NAACL 2025 | GitHub | black-box, prompt-level |
Probing Unlearned Diffusion Models: A Transferable Adversarial Attack Perspective | 2024 | arXiv | GitHub | gray-box, embedding-level |
Circumventing Concept Erasure Methods For Text-to-Image Generative Models | 2023 | ICLR 2024 | GitHub | white-box, embedding-level |
Ring-A-Bell! How Reliable are Concept Removal Methods for Diffusion Models? | 2023 | ICLR 2024 | GitHub | black-box, prompt-level |
To Generate or Not? Safety-Driven Unlearned Diffusion Models Are Still Easy To Generate Unsafe Images ... For Now | 2023 | ECCV 2024 | GitHub | white-box, prompt-level |

Paper | Year | Venue | Code
---|---|---|---
Six-CD: Benchmarking Concept Removals for Benign Text-to-image Diffusion Models | 2025 | CVPR 2025 | GitHub |
UnlearnCanvas: A Stylized Image Dataset to Benchmark Machine Unlearning for Diffusion Models | 2024 | NeurIPS 2024 | GitHub |