@molereddy molereddy commented Apr 25, 2025

What does this PR do?

Resolves several bugs related to narrow tokenization edge cases. Items 2, 3, and 4 are interlinked, and item 1 is closely related to them.

  1. Log-probs
  • Previously, the tokenwise_logprobs and tokenwise_vocab_logprobs functions inadvertently kept the eos token in the label sequence, which forced the model to predict the eos token in the MIA (minK, minK++, gradnorm) and ES/EM metrics.
  • This was not much of an issue with TOFU, since the chat model predicts eos anyway, but the original intention was never to involve eos in these calculations.
  • It became a real issue with MUSE (where the model is not a chat model and thus doesn't predict eos), leading to the low ES scores reported in Extraction Strength on knowmem(MUSE) and perturb(TOFU) #100.
  • This PR fixes the code; the results and existing evals are to be updated accordingly.
  2. Small target sequences getting mixed up with the prompt
  • For some models, such as Phi-1.5, that have unusual tokenizers, and for data points with one-word target labels, the target words can end up being tokenized into the prompt.
  • This PR warns about such cases and modifies ES and EM to set the value to None for them.
  3. Handling empty label sequence cases
  • When the above issue occurred, the code previously did not update both the labels and logprobs lists, leading to a mismatch and mix-up across the batch, which in turn caused failed asserts with Phi-1.5 [pointed out in Extraction Strength on knowmem(MUSE) and perturb(TOFU) #100].
  • This PR ensures empty tensors are sent in such cases.
  4. Date in system prompt
  • The date is now fixed to handle Llama 3.1 reproducibility issues caused by the tokenizer automatically adding the current date.
  5. Handling invalid values
  • When a metric evaluates to None on a data point (usually due to tokenization issues), the aggregation now filters out such points and is computed only over those that have values.
  6. Misc modifications
  • The tokenwise_logprobs and tokenwise_vocab_logprobs functions previously changed the model's train mode. This is better handled in the evaluation metric code itself (it is used only by the gradnorm MIA attack).
  • Documentation fix to remove a reference to a non-existent cfg file.
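
To make the label handling described in the list above concrete, here is a minimal, dependency-free sketch. It is not the code from this PR: the function names (`target_token_ids`, `token_match_score`, `mean_ignoring_none`), the `-100` masking value, and the toy token ids are all assumptions chosen for illustration, and `token_match_score` is only a simplified stand-in for a metric such as EM. It shows three ideas from the description: removing pad tokens before removing a trailing eos, returning None when the target is empty, and averaging only over valid values.

```python
from typing import List, Optional, Sequence

IGNORE_INDEX = -100  # assumed masking value for prompt positions (illustrative)


def target_token_ids(labels: Sequence[int], pad_id: int, eos_id: int) -> List[int]:
    """Return target tokens with prompt mask, trailing pads, and trailing eos removed."""
    ids = [t for t in labels if t != IGNORE_INDEX]
    # Strip trailing padding first, so that the eos check sees the real last token.
    while ids and ids[-1] == pad_id:
        ids.pop()
    # Then drop a single trailing eos so the model is not scored on predicting it.
    if ids and ids[-1] == eos_id:
        ids.pop()
    return ids


def token_match_score(pred_ids: Sequence[int], target_ids: Sequence[int]) -> Optional[float]:
    """Fraction of target positions matched; None when the target is empty."""
    if not target_ids:
        # The target was swallowed by the prompt during tokenization: mark as invalid.
        return None
    matches = sum(p == t for p, t in zip(pred_ids, target_ids))
    return matches / len(target_ids)


def mean_ignoring_none(values: Sequence[Optional[float]]) -> Optional[float]:
    """Average only the valid entries; None if every entry is invalid."""
    valid = [v for v in values if v is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)


# Toy batch: pad_id=0, eos_id=2. The second example has no target tokens left.
batch_labels = [
    [-100, -100, 11, 12, 2, 0, 0],
    [-100, -100, -100, 2, 0, 0, 0],
]
batch_preds = [[11, 99], [5]]

targets = [target_token_ids(lab, pad_id=0, eos_id=2) for lab in batch_labels]
scores = [token_match_score(p, t) for p, t in zip(batch_preds, targets)]
print(targets, scores, mean_ignoring_none(scores))
```

Keeping one entry per example (an empty list or None rather than skipping the example) keeps the per-example lists aligned across the batch, which is the kind of mismatch item 3 describes.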

Fixes # (issue)
#98
#100

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Have you gone through the contributions guide?
  • Are your changes documented? Read documentation guidelines here.

molereddy and others added 18 commits March 1, 2025 09:13
* testing commit

* Fixes

* cleanup
Fix tofu_unlearn.sh for IdKDPO method.
* IdkDPO script fix in tofu_unlearn.sh (locuslab#65)

* Fix hyperlinks in README
* Download I don't know data in setup_data.py
* Fix tofu_unlearn.sh for IdkDPO

---------

Co-authored-by: Anmol Mekala <49127549+molereddy@users.noreply.github.com>

* overwrite=True

* RMU added

* Fix ref model device

* ruff fix

* RMU updated

* Update rmu.py

* Update README.md: add RMU

* Added references and renamed functions

---------

Co-authored-by: Anmol Mekala <49127549+molereddy@users.noreply.github.com>
…on (#8)

* docs: updates, small corrections, re-formats

* modified ruff commands

* modified ruff commands

* CI/CD minor updates

* added contributing + leaderboard

* fix minor spelling misatkes

* docs: bunch of minor updates

* docs fixes

---------

Co-authored-by: molereddy <m.anmolreddy@gmail.com>
* Re-formatting + more badges

* Update and fix docs

* Make error msg accurate

* handle lack of flash-attn flag better

* Document more hydra features

* update example exp configs to match latest supported metrics

* Change HF logo

* Simplify eval exp cfg dump

* testing push workflows

* Add workflow test branch

* update workflow path again

* Reformat badges to fix blue line issue

* Fix div

* revert change to tests build path
* documentation fix

* remove eos only after removing pad tokens + not use model train inside evaluation

* Fix date to handle Llama3.1 repro issues due to tokenizer automatically adding curr date

* ruff fixes

* minor mistake

* warn about and handle weird tokenization cases for small targets

* Ruff fixes

* The assert must hold by definition

* Updating leaderboard.md numbers

* Allow for invalid evaluations which are excluded from averaging

* bug fix

* ruff fixes

---------

Co-authored-by: Dornavineeth <vineethdorna@gmail.com>
@molereddy molereddy merged commit 8df2914 into locuslab:main Apr 25, 2025
1 check passed