Thank you for the wonderful paper collection. We have a line of research on harmful fine-tuning for LLMs. Could you please include this line of work into the repo?
| Title | Link | Code | Venue | Classification | Model | Comment |
|---|---|---|---|---|---|---|
| Vaccine: Perturbation-aware Alignment for Large Language Models against Harmful Fine-tuning | arxiv | github | NeurIPS'24 | Defense | LLM | Harmful fine-tuning |
| Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning | arxiv | github | NeurIPS'24 | Defense | LLM | Harmful fine-tuning |
| Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation | arxiv | github | arXiv | Defense | LLM | Harmful fine-tuning |
| Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning | arxiv | To-be-released | arXiv | Defense | LLM | Harmful fine-tuning |
| Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey | arxiv | awesome project | arXiv | Survey & Other awesome project | LLM | Harmful fine-tuning |
Thank you in advance!
Best,
Tiansheng Huang