uw-nsl/ChatBug

ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates

Fengqing Jiang1,*, Zhangchen Xu1,*, Luyao Niu1,*,
Bill Yuchen Lin2, Radha Poovendran1

1University of Washington   2Allen Institute for AI   
*Equal Contribution

Warning: This project contains model outputs that may be considered offensive.

[arXiv]

Overview

Usage

Setup Environment

bash build_env.sh chatbug

Run ChatBug

python chatbug.py

To configure the experiments, either edit attack.yaml or pass command-line arguments.
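The two configuration routes can be pictured as file-based defaults with command-line values layered on top. The sketch below is only an illustration of that general pattern, not the logic in chatbug.py: the key names (`model`, `attack`), the helper `resolve_config`, and the assumption that command-line values take precedence over the file are all hypothetical, and a plain dictionary stands in for the parsed contents of attack.yaml.

```python
import argparse

# Hypothetical stand-in for values that would be read from attack.yaml.
# These keys are illustrative only, not the repository's actual schema.
FILE_CONFIG = {"model": "example-model", "attack": "example-attack"}


def resolve_config(file_config, argv):
    """Return file_config with any command-line values layered on top."""
    parser = argparse.ArgumentParser()
    for key in file_config:
        # default=None distinguishes "flag not given" from a supplied value.
        parser.add_argument(f"--{key}", default=None)
    args = parser.parse_args(argv)
    # Keep only the flags the user actually passed.
    overrides = {k: v for k, v in vars(args).items() if v is not None}
    # Assumed precedence: command-line values win over file values.
    return {**file_config, **overrides}


if __name__ == "__main__":
    import sys
    print(resolve_config(FILE_CONFIG, sys.argv[1:]))
```

Under these assumptions, running the sketch with no flags returns the file values unchanged, while passing `--model other-model` replaces only that entry. Check chatbug.py and attack.yaml in the repository for the real option names.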

Citation

If you find our project useful in your research, please consider citing:

@misc{jiang2024chatbug,
      title={ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates}, 
      author={Fengqing Jiang and Zhangchen Xu and Luyao Niu and Bill Yuchen Lin and Radha Poovendran},
      year={2024},
      eprint={2406.12935},
      archivePrefix={arXiv}
}

About

[AAAI25] Official Repo of Paper `ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates`
