This repository is an experiment in testing how Large Language Models (LLMs) handle security-related queries and whether their safeguards can be bypassed to generate useful hacking-related commands. The structure of this repository is somewhat chaotic: it is more a personal collection of prompts, responses, and observations than a polished guide for others. Eventually, I will document my reasoning behind each approach and the outcomes.
This is not a tutorial or a hacking guide. It serves as:
- A playground for testing prompt engineering techniques.
- A way to document how LLMs react to security-related queries.
- An evolving record of what works, what doesn’t, and why.
The files here are loosely categorized but may not follow a strict order. Many of them are written in a way that makes sense to me rather than in a form that is friendly to other readers. If you find them useful, great, but this repository exists primarily for my own understanding.
At some point, I plan to add explanations for why certain prompts worked, why others failed, and how LLMs' content filters and security policies evolve over time. Until then, this will mostly be raw notes and observations.
This repository is for educational and research purposes only. Nothing here should be used for illegal activities. The goal is to understand LLM limitations and security, not to exploit them.