😈 Jailbreak System

A platform for testing, evaluating, and analyzing jailbreak attacks against large language models. This system provides tools and interfaces for users to assess the robustness of closed-source language models against various attack strategies.

🐱 Project Overview

The Jailbreak System consists of three main components:

Frontend: A React-based web interface for interacting with the system
Backend: A Flask API server handling the core logic and model interactions
Database: Stores attack patterns, prompts, and results

🦾 Updates

04/25/2025: Enables real LLM APIs (gpt-4o-mini, gpt-4o-2024-0806, gpt-4-turbo, claude-3.5-sonnet)

05/08/2025: Optimized evaluation logic (Harmful Score >= 4 -> Success)

05/10/2025: Implemented our own algorithm MIST!

🙌 Demo

Homepage

Attacks Page

Prompts Page

Results Page

Result Details

Statistics Page

✳️ Features

Create and manage jailbreak attacks
Test attacks against various language models
Analyze attack success rates and patterns
Categorize and organize prompts
Visualize attack results
Implement custom attack algorithms

✅ Supported Attack Algorithms

The system incorporates the following algorithms:

Multi-language: Uses low-resource language to bypass restrictions. Please refer to NeuraIPS'23Workshop-LRL
ASCII Art: Encodes sensitive words using ASCII art, built upon ACL'24-ArtPrompt
Cipher: Uses various cryptographic encoding methods to bypass content moderation, built upon ICLR'24-CipherChat
MIST: Our own jailbreak algorithm!! Please refer to mist_optimizer.py

🛖 Structure

JailbreakSystem/
├── frontend/          # React-based web interface
└── backend/          # Flask API server (includes database)

🔛 Getting Started

Clone the repository:

git clone https://github.com/SandyyyZheng/JailbreakSystem.git
cd JailbreakSystem

For documentations, see:

📖 License

This project is under the MIT license.

👻 Acknowledgments

Deepbricks for providing the APIs
2025 Graduate Design for HFUT
Advised by Prof. Yuanzhi Yao. My deepest thanks to Dr. Yao for all the encouragement and support along the way 🥺
Relies heavily on Cursor (mainly claude-3.5-sonnet & claude-3.7-sonnet) to construct framework and fix bugs. Kudos to AI🤖!

Name		Name	Last commit message	Last commit date
Latest commit History 43 Commits
backend		backend
demo		demo
frontend		frontend
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

😈 Jailbreak System

🐱 Project Overview

🦾 Updates