Merged
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -6,4 +6,4 @@ repos:
- id: ruff
args: [check, --fix, scripts, src, setup.py, setup_data.py]
- id: ruff
args: [format, scripts, src, setup.py setup_data.py]
args: [format, --check, scripts, src, setup.py, setup_data.py]
50 changes: 40 additions & 10 deletions README.md
@@ -21,18 +21,37 @@

## 📖 Overview

We provide efficient and streamlined implementations of the TOFU and MUSE unlearning benchmarks while supporting 6 unlearning methods, 3+ datasets, 6+ evaluation metrics, and 6+ LLM architectures. Each of these can be easily extended to incorporate more variants.
We provide efficient and streamlined implementations of the TOFU and MUSE unlearning benchmarks while supporting 6 unlearning methods, 3+ datasets, 9+ evaluation metrics, and 6+ LLM architectures. Each of these can be easily extended to incorporate more variants.

We invite the LLM unlearning community to collaborate by adding new benchmarks, unlearning methods, datasets and evaluation metrics here to expand OpenUnlearning's features, gain feedback from wider usage and drive progress in the field.

---

### 📢 Updates

#### [Mar 27, 2025]
- **Easier contributions, leaderboard and reproducibility**: We've updated the documentation to make contributing new unlearning methods and benchmarks much easier. Users can document additions better and also update a leaderboard with their results. See [this section](#-how-to-contribute) for details.
#### [Apr 6, 2025]
- **More Metrics!** Added 6 Membership Inference Attacks (MIA) (LOSS, ZLib, Reference, GradNorm, MinK, and MinK++), along with Extraction Strength (ES) and Exact Memorization (EM) as additional evaluation metrics.
- **More TOFU Evaluations!** Now includes a holdout set and supports MIA attack-based evaluation. You can now compute MUSE's privleak on TOFU.
- **More Documentation!** [`docs/links.md`](docs/links.md) contains resources for each of the implemented features and other useful LLM unlearning resources.


<details>
<summary><b>Older Updates</b></summary>

#### [Mar 27, 2025]
- **More Documentation: easy contributions and the leaderboard functionality**: We've updated the documentation to make contributing new unlearning methods and benchmarks much easier. Users can document additions better and also update a leaderboard with their results. See [this section](#-how-to-contribute) for details.

#### [Mar 9, 2025]
- **More Methods!** Added support for [RMU](https://arxiv.org/abs/2403.03218) (representation-engineering based unlearning).

#### [Feb 27, 2025]
⚠️ **Repository Update**: This repo replaces the original TOFU codebase at [`github.com/locuslab/tofu`](https://github.com/locuslab/tofu), which is no longer maintained.

</details>


---

## 🗃️ Available Components

We provide several variants for each of the components in the unlearning pipeline.
@@ -41,7 +60,7 @@ We provide several variants for each of the components in the unlearning pipelin
|------------------------|----------------------|
| **Benchmarks** | [TOFU](https://arxiv.org/abs/2401.06121), [MUSE](https://muse-bench.github.io/) |
| **Unlearning Methods** | GradAscent, GradDiff, NPO, SimNPO, DPO, RMU |
| **Evaluation Metrics** | Verbatim Probability, Verbatim ROUGE, QA-ROUGE, MIA Attacks, TruthRatio, Model Utility |
| **Evaluation Metrics** | Verbatim Probability, Verbatim ROUGE, Knowledge QA-ROUGE, Model Utility, Forget Quality, TruthRatio, Extraction Strength, Exact Memorization, 6 MIA attacks |
| **Datasets** | MUSE-News (BBC), MUSE-Books (Harry Potter), TOFU (different splits) |
| **Model Families** | TOFU: LLaMA-3.2, LLaMA-3.1, LLaMA-2; MUSE: LLaMA-2; Additional: Phi-3.5, Phi-1.5, Gemma |

@@ -77,14 +96,15 @@ pip install --no-build-isolation flash-attn==2.6.3

# data setup
python setup_data.py # saves/eval now contains evaluation results of the uploaded models
# Downloads log files with metric eval results (incl retain model logs) from the models used in the supported benchmarks.
# Downloads log files with metric eval results (incl retain model logs) from the models
# used in the supported benchmarks.
```

---

### 🔄 Updated TOFU benchmark

We've updated Open-Unlearning's TOFU benchmark target models to use a wider variety of newer architectures with sizes varying from 1B to 8B. These include LLaMA 3.2 1B, LLaMA 3.2 3B, LLaMA 3.1 8B, and the original LLaMA-2 7B from [the old version of TOFU](https://github.com/locuslab/tofu).
We've updated Open-Unlearning's TOFU benchmark target models to use a wider variety of newer architectures with sizes varying from 1B to 8B. These include LLaMA 3.2 1B, LLaMA 3.2 3B, LLaMA 3.1 8B, and the original LLaMA-2 7B (re-created) target models from [the old version of TOFU](https://github.com/locuslab/tofu).

For each architecture, we have finetuned on four different splits of the TOFU dataset: `full`, `retain90`, `retain95`, and `retain99`, for a total of 16 finetuned models. The first serves as the target (the base model for unlearning) and the rest are retain models that serve as the reference for each forget split. These models are on [HuggingFace](https://huggingface.co/collections/open-unlearning/tofu-new-models-67bcf636334ea81727573a9f0), and the paths to these models can be set in the experimental configs or in command-line overrides.

@@ -112,15 +132,18 @@ python src/train.py --config-name=unlearn.yaml experiment=unlearn/tofu/default \
An example command for launching a TOFU evaluation process on `forget10` split:

```bash
model=Llama-3.2-1B-Instruct
python src/eval.py --config-name=eval.yaml experiment=eval/tofu/default \
model=Llama-3.2-1B-Instruct \
model.model_args.pretrained_model_name_or_path=open-unlearning/tofu_Llama-3.2-1B-Instruct_full \
model=${model} \
model.model_args.pretrained_model_name_or_path=open-unlearning/tofu_${model}_full \
retain_logs_path=saves/eval/tofu_${model}_retain90/TOFU_EVAL.json \
task_name=SAMPLE_EVAL
```

- `experiment`- Path to the evaluation configuration [`configs/experiment/eval/tofu/default.yaml`](configs/experiment/eval/tofu/default.yaml).
- `model`- Sets up the model and tokenizer configs for the `Llama-3.2-1B-Instruct` model.
- `model.model_args.pretrained_model_name_or_path`- Overrides the default experiment config to evaluate a model from a HuggingFace ID (can use a local model checkpoint path as well).
- `retain_logs_path`- Sets the path to the reference model eval logs, which are needed to compute reference-model-based metrics like `forget_quality` in TOFU.
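
The overrides above all derive from a single model name. A minimal Python sketch of the same composition, assuming the naming pattern shown in the bash example; `build_eval_command` is a hypothetical helper, not part of the repository, and the `retain90` default simply mirrors the path used above:

```python
def build_eval_command(model, retain_split="retain90", task_name="SAMPLE_EVAL"):
    """Return the argv list for a TOFU evaluation run (hypothetical helper)."""
    return [
        "python", "src/eval.py",
        "--config-name=eval.yaml",
        "experiment=eval/tofu/default",
        # Selects the model and tokenizer configs.
        f"model={model}",
        # HuggingFace ID of the finetuned target model for this architecture.
        f"model.model_args.pretrained_model_name_or_path=open-unlearning/tofu_{model}_full",
        # Eval logs of the matching retain model, used by reference-based metrics.
        f"retain_logs_path=saves/eval/tofu_{model}_{retain_split}/TOFU_EVAL.json",
        f"task_name={task_name}",
    ]


cmd = build_eval_command("Llama-3.2-1B-Instruct")
print(" \\\n  ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root; building it as a list avoids shell quoting issues when model names contain special characters.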

For more details about creating and running evaluations, refer to [`docs/evaluation.md`](docs/evaluation.md).

@@ -153,7 +176,8 @@ For more in-depth information on specific aspects of the framework, refer to the
| [`docs/experiments.md`](docs/experiments.md) | Guide on running experiments in various configurations and settings, including distributed training, fine-tuning, and overriding arguments. |
| [`docs/hydra.md`](docs/hydra.md) | Explanation of the Hydra features used in configuration management for experiments. |
| [`community/leaderboard.md`](community/leaderboard.md) | Reference results from various unlearning methods run using this framework on TOFU and MUSE benchmarks. |
| [`docs/repro.md`](docs/repro.md) (deprecated) | Results are provided solely for reproducibility purposes, without any parameter tuning. |
| [`docs/links.md`](docs/links.md) | Links to the research papers and other sources behind the implemented features. |
| [`docs/repro.md`](docs/repro.md) | Results are provided solely for reproducibility purposes, without any parameter tuning. |
---

## 🔗 Support & Contributors
@@ -197,9 +221,15 @@ If you use OpenUnlearning in your research, please cite OpenUnlearning and the b
### 🤝 Acknowledgements

- This repo is inspired by [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory).
- The [TOFU](https://github.com/locuslab/tofu) and [MUSE](https://github.com/jaechan-repo/muse_bench) benchmarks served as the foundation for our re-implementation.
- The [TOFU](https://github.com/locuslab/tofu) and [MUSE](https://github.com/swj0419/muse_bench) benchmarks served as the foundation for our re-implementation.

---

### 📄 License
This project is licensed under the MIT License. See the [`LICENSE`](LICENSE) file for details.

---

### Star History

[![Star History Chart](https://api.star-history.com/svg?repos=locuslab/open-unlearning&type=Date)](https://www.star-history.com/#locuslab/open-unlearning&Date)
45 changes: 9 additions & 36 deletions community/leaderboard.md
@@ -8,48 +8,35 @@ We encourage the community to develop new methods, optimize them for specific be

To implement a new method, refer to our [contributing guide](../docs/contributing.md).

> **Note:** The [results.md](../docs/results.md) file is maintained for reproducibility purposes. However, we encourage contributors to update the leaderboard table instead of the reproducibility table. We will continue refining and tuning baseline methods to keep the leaderboard up to date.
> [!NOTE]
> The [results.md](../docs/results.md) file is maintained for reproducibility purposes. However, we encourage contributors to update the leaderboard table instead of the reproducibility table. We will continue refining and tuning baseline methods to keep the leaderboard up to date.


### TOFU unlearning on the `Llama-3.2-1B-Instruct` architecture
### TOFU unlearning on the `Llama-2-7b-chat-hf` architecture

<div style="overflow-x: auto; max-width: 100%;">
<table class="dataframe">
<thead>
<tr>
<th>Method</th>
<th style="text-align: center;" colspan="2" halign="left">forget01</th>
<th style="text-align: center;" colspan="2" halign="left">forget05</th>
<th style="text-align: center;" colspan="2" halign="left">forget10</th>
</tr>
<tr>
<th></th>
<th>forget_quality</th>
<th>model_utility</th>
<th>forget_quality</th>
<th>model_utility</th>
<th>forget_quality</th>
<th>model_utility</th>
</tr>
</thead>
<tbody>
<tr>
<th>Finetuned</th>
<td>0.01</td>
<td>0.60</td>
<td>2.96e-13</td>
<td>0.6</td>
<td>8.08e-22</td>
<td>0.6</td>
<td>4.35e-25</td>
<td>0.63</td>
</tr>
<tr>
<th>Retain</th>
<td>1.0</td>
<td>0.60</td>
<td>1.0</td>
<td>0.6</td>
<td>1.0</td>
<td>0.59</td>
<td>0.61</td>
</tr>
<tr>
<td colspan="20"> </td>
@@ -70,37 +57,23 @@ To implement a new method, refer to our [contributing guide](../docs/contributin
<thead>
<tr>
<th>Method</th>
<th style="text-align: center;" colspan="2" halign="left">forget01</th>
<th style="text-align: center;" colspan="2" halign="left">forget05</th>
<th style="text-align: center;" colspan="2" halign="left">forget10</th>
</tr>
<tr>
<th></th>
<th>forget_quality</th>
<th>model_utility</th>
<th>forget_quality</th>
<th>model_utility</th>
<th>forget_quality</th>
<th>model_utility</th>
</tr>
</thead>
<tbody>
<tr>
<th>Finetuned</th>
<td>0.01</td>
<td>0.60</td>
<td>2.96e-13</td>
<td>0.6</td>
<td>8.08e-22</td>
<td>1.66e-21</td>
<td>0.6</td>
</tr>
<tr>
<th>Retain</th>
<td>1.0</td>
<td>0.60</td>
<td>1.0</td>
<td>0.6</td>
<td>1.0</td>
<td>0.59</td>
</tr>
<tr>
@@ -143,7 +116,7 @@ To implement a new method, refer to our [contributing guide](../docs/contributin
<td>0.64</td>
<td>0.58</td>
<td>-99.81</td>
<td>0.55</td>
<td>0.56</td>
<td>0.47</td>
<td>1.0</td>
<td>-57.26</td>
@@ -152,7 +125,7 @@ To implement a new method, refer to our [contributing guide](../docs/contributin
<tr>
<th>Retain</th>
<td>0.33</td>
<td>0.21</td>
<td>0.20</td>
<td>0</td>
<td>0.56</td>
<td>0.3</td>
22 changes: 22 additions & 0 deletions configs/data/datasets/MUSE_MIA.yaml
@@ -0,0 +1,22 @@
MUSE_MIA_holdout:
access_key: holdout
handler: CompletionDataset
args:
hf_args:
path: "muse-bench/MUSE-News"
name: "privleak"
split: "holdout"
prefix_key: "prompt" # doesn't exist in dataset
text_key: "text"
max_length: 2048
MUSE_MIA_forget:
access_key: forget
handler: CompletionDataset
args:
hf_args:
path: "muse-bench/MUSE-News"
name: "privleak"
split: "forget"
prefix_key: "prompt" # doesn't exist in dataset
text_key: "text"
max_length: 2048
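
The two entries above pair a `forget` split with a `holdout` split, which is the shape a membership inference evaluation needs: one score per example from each split, followed by a measure of how well the scores separate the two. A minimal sketch of that last step, assuming that a higher score means "more likely seen in training" and using AUC as the separation measure. The function name, the score convention and the sample values are illustrative assumptions, not the repository's API or results:

```python
def mia_auc(member_scores, nonmember_scores):
    """Probability that a random member outscores a random non-member.

    Ties count as half a win, which matches the usual rank-based AUC.
    """
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


# Made-up attack scores, one per example in each split.
forget_scores = [0.9, 0.8, 0.4]
holdout_scores = [0.5, 0.3, 0.2]
auc = mia_auc(forget_scores, holdout_scores)
print(round(auc, 3))
```

An AUC near 0.5 means the attack cannot tell forget examples from held-out ones; values far from 0.5 indicate that membership information is still detectable.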
10 changes: 0 additions & 10 deletions configs/data/datasets/MUSE_forget_privleak.yaml

This file was deleted.

10 changes: 0 additions & 10 deletions configs/data/datasets/MUSE_holdout_privleak.yaml

This file was deleted.

9 changes: 0 additions & 9 deletions configs/data/datasets/MUSE_retain_privleak.yaml

This file was deleted.

22 changes: 22 additions & 0 deletions configs/data/datasets/TOFU_MIA.yaml
@@ -0,0 +1,22 @@
TOFU_QA_forget:
access_key: forget
handler: QADataset
args:
hf_args:
name: "forget10"
split: "train"
path: "locuslab/TOFU"
question_key: "question"
answer_key: "answer"
max_length: 512
TOFU_QA_holdout:
access_key: holdout
handler: QADataset
args:
hf_args:
name: "holdout10"
path: "locuslab/TOFU"
split: "train"
question_key: "question"
answer_key: "answer"
max_length: 512
3 changes: 2 additions & 1 deletion configs/eval.yaml
@@ -13,4 +13,5 @@ model:
device_map: cuda

mode: eval
task_name: ???
task_name: ???
seed: 0
8 changes: 8 additions & 0 deletions configs/eval/muse.yaml
@@ -7,6 +7,14 @@ defaults:
- retain_knowmem_ROUGE
- forget_verbmem_ROUGE
- privleak
- extraction_strength
# - exact_memorization
# - mia_min_k_plus_plus
# - mia_min_k
# - mia_loss
# - mia_reference
# - mia_zlib
# - mia_gradnorm

handler: MUSEEvaluator
output_dir: ${paths.output_dir} # set to default eval directory
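
Of the attacks listed in the defaults above, `mia_min_k` is the simplest to illustrate. A minimal sketch of a Min-K style score, assuming per-token log-probabilities are already available and that the score is the mean log-probability of the lowest-scoring fraction `k` of tokens. The function name and sample values are hypothetical; the default `k=0.4` only mirrors the `percentile_K: 40` setting of the removed `minKpc_negative_logprob` config, and the repository's handler may differ in sign or normalization:

```python
def min_k_logprob(token_logprobs, k=0.4):
    """Mean log-probability of the lowest-scoring k fraction of tokens."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    if not 0.0 < k <= 1.0:
        raise ValueError("k must be in (0, 1]")
    # Keep at least one token so very short sequences still get a score.
    n_keep = max(1, int(len(token_logprobs) * k))
    lowest = sorted(token_logprobs)[:n_keep]
    return sum(lowest) / n_keep


# Made-up log-probabilities for a five-token sequence.
logprobs = [-0.1, -2.0, -0.5, -3.0, -0.2]
score = min_k_logprob(logprobs, k=0.4)
print(score)  # -2.5
```

The intuition is that text seen during training rarely contains tokens the model finds very surprising, so its lowest log-probabilities tend to be higher than those of unseen text.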
@@ -1,13 +1,12 @@
# @package eval.muse.metrics.forget_minKpc_neg_logprob
# @package eval.muse.metrics.exact_memorization
defaults:
- ../../data/datasets@datasets: MUSE_forget_privleak
- ../../data/datasets@datasets: MUSE_forget_verbmem
- ../../collator@collators: DataCollatorForSupervisedDatasetwithIndex
handler: minKpc_negative_logprob
batch_size: 8
percentile_K: 40

handler: exact_memorization
batch_size: 8
datasets:
MUSE_forget_privleak:
MUSE_forget_verbmem:
args:
hf_args:
path: muse-bench/MUSE-${eval.muse.data_split}
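
The `exact_memorization` handler configured above is evaluated on the verbatim-memorization forget set. A minimal sketch of one common formulation, assuming the metric is the fraction of target tokens that the model's greedy next-token prediction reproduces under teacher forcing. The function name and token IDs are hypothetical, and the repository's handler may define the metric differently:

```python
def exact_memorization(predicted_ids, target_ids):
    """Fraction of positions where the predicted token equals the target token."""
    if len(predicted_ids) != len(target_ids):
        raise ValueError("prediction and target must have the same length")
    if not target_ids:
        raise ValueError("need at least one target token")
    matches = sum(p == t for p, t in zip(predicted_ids, target_ids))
    return matches / len(target_ids)


# Made-up token IDs: the model gets three of four positions right.
predicted = [5, 7, 9, 2]
target = [5, 7, 1, 2]
em = exact_memorization(predicted, target)
print(em)  # 0.75
```

A score of 1.0 means the model predicts every token of the forget text correctly, so lower values after unlearning suggest less verbatim retention.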
12 changes: 12 additions & 0 deletions configs/eval/muse_metrics/extraction_strength.yaml
@@ -0,0 +1,12 @@
# @package eval.muse.metrics.extraction_strength
defaults:
- ../../data/datasets@datasets: MUSE_forget_verbmem
- ../../collator@collators: DataCollatorForSupervisedDatasetwithIndex

handler: extraction_strength
batch_size: 8
datasets:
MUSE_forget_verbmem:
args:
hf_args:
path: muse-bench/MUSE-${eval.muse.data_split}
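
The `extraction_strength` handler above runs on the same forget set. A minimal sketch of one common formulation, assuming extraction strength is one minus the smallest fraction of a sequence that must be supplied as a prefix before greedy decoding reproduces the rest exactly. `greedy_continue` is a hypothetical callable standing in for the model, and the toy model below is invented for illustration; the repository's handler may compute the metric differently:

```python
def extraction_strength(tokens, greedy_continue):
    """1 - (shortest prefix fraction that lets the model recover the suffix)."""
    n = len(tokens)
    if n == 0:
        raise ValueError("need at least one token")
    for prefix_len in range(n):
        suffix = list(tokens[prefix_len:])
        # Ask the model to continue the prefix for exactly len(suffix) tokens.
        if list(greedy_continue(list(tokens[:prefix_len]), len(suffix))) == suffix:
            return 1.0 - prefix_len / n
    return 0.0  # the suffix was never recovered


MEMORIZED = [1, 2, 3, 4, 5]


def toy_model(prefix, n_new):
    """Toy stand-in: recalls MEMORIZED only when given at least two of its tokens."""
    if len(prefix) >= 2 and prefix == MEMORIZED[: len(prefix)]:
        return MEMORIZED[len(prefix): len(prefix) + n_new]
    return [0] * n_new


es = extraction_strength(MEMORIZED, toy_model)
print(round(es, 2))  # 0.6
```

Higher values mean the model needs less context to regenerate the text, so a successful unlearning run should push this score down on the forget set.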
13 changes: 0 additions & 13 deletions configs/eval/muse_metrics/holdout_minKpc_neg_logprob.yaml

This file was deleted.
