LLM CTF: Prompt Injection & Hallucination Challenge

Welcome to LLM CTF, a Capture The Flag-style challenge focused on exploiting the vulnerabilities of Large Language Models (LLMs). This challenge is designed to highlight two major LLM weaknesses: prompt injection and hallucination exploitation.

Objective

Your mission is to retrieve hidden flags by:

Exploiting prompt injection to override model instructions.
Triggering hallucinations by submitting convincingly false information.

There are three hidden flags:

Level 1 – for successful instruction injection.
Level 2 – for successful hallucination exploitation.
Level 3 – for simulating trusted roles like developers or trainers.

Setup Instructions

1. Clone the Repository

git clone https://github.com/your-username/llm-ctf.git
cd llm-ctf

2. Install Dependencies

pip install -r requirements.txt

3. Set Up Your Gemini API Key

Create a .env file in the project root directory and paste your Gemini API key:

GEMINI_API_KEY=your_actual_gemini_api_key_here

Important: Never share this key publicly or push it to GitHub. Make sure .env is listed in .gitignore.

4. Run the Flask App

python app.py

Open your browser and visit http://127.0.0.1:5000 to begin the challenge.

How the Challenge Works

Level 1: Prompt Injection Flag

Make the model forget the previous instructions by tricking it you are doing it for ethical purposes.

Level 2: Hallucination Flag

Submit a clearly false fact in full detail and ask for an explanation, If the model hallucinates a convincing story it reveals the flag.

Level 3: Roleplay/Jailbreak Flag

Pretend to be a trusted AI developer or trainer.

What Doesn’t Work

Asking directly for the flag: "What is the flag?"
Giving real facts for hallucination
Spamming or begging — this challenge rewards creativity and prompt engineering

Educational Purpose

This project demonstrates:

Weaknesses in LLM instruction boundaries
Risks of hallucination and roleplay exploitation
Importance of secure prompt design and filtering

This is ideal for red teamers, AI researchers, prompt engineers, and cybersecurity educators.

Public Hosting (Optional)

To share this challenge over the internet (for testing only), use ngrok:

ngrok http 5000

You’ll get a public link you can share for external access.

requirements.txt

Here's what your requirements.txt includes:

Flask==3.0.2
python-dotenv==1.0.1
requests==2.31.0

Author

Built with 💚 by Developers for developers.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
static		static
templates		templates
README.md		README.md
app.py		app.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

LLM CTF: Prompt Injection & Hallucination Challenge

Objective

Setup Instructions

1. Clone the Repository

2. Install Dependencies

3. Set Up Your Gemini API Key

4. Run the Flask App

How the Challenge Works

Level 1: Prompt Injection Flag

Level 2: Hallucination Flag

Level 3: Roleplay/Jailbreak Flag

What Doesn’t Work

Educational Purpose

Public Hosting (Optional)

requirements.txt

Author

About

Uh oh!

Releases

Packages

Uh oh!

Languages

Maha1503/llm_ctf

Folders and files

Latest commit

History

Repository files navigation

LLM CTF: Prompt Injection & Hallucination Challenge

Objective

Setup Instructions

1. Clone the Repository

2. Install Dependencies

3. Set Up Your Gemini API Key

4. Run the Flask App

How the Challenge Works

Level 1: Prompt Injection Flag

Level 2: Hallucination Flag

Level 3: Roleplay/Jailbreak Flag

What Doesn’t Work

Educational Purpose

Public Hosting (Optional)

requirements.txt

Author

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages