PentestAI is a hobby project I wrote with my friend @sarperavci. Unfortunately, the project is not complete, and it is unlikely to be finished soon. Still, I want to make it public so you can draw inspiration from it and contribute.
demo.mp4
The demo shows performance with the qwen2.5-coder-14b-instruct-q2_k.gguf model running on a llama.cpp server.
Thanks to the different model wrappers, you can use any LLM, either running locally or accessed through an API.
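The wrapper code is not shown here, but below is a minimal sketch of what such a wrapper could look like, assuming an OpenAI-compatible chat endpoint (which the llama.cpp server exposes). The class and method names are illustrative, not PentestAI's actual API.

```python
# Illustrative sketch of a model wrapper; names are hypothetical, not
# PentestAI's actual classes. Assumes an OpenAI-compatible
# /v1/chat/completions endpoint (the llama.cpp server provides one).
import os
import requests


class LLMWrapper:
    def __init__(self, base_url: str, api_key: str = "", model: str = "local"):
        self.base_url = base_url.rstrip("/")
        self.api_key = api_key
        self.model = model

    def chat(self, messages: list[dict]) -> str:
        headers = {"Content-Type": "application/json"}
        if self.api_key:  # local servers usually accept requests without a key
            headers["Authorization"] = f"Bearer {self.api_key}"
        resp = requests.post(
            f"{self.base_url}/v1/chat/completions",
            json={"model": self.model, "messages": messages},
            headers=headers,
            timeout=120,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]


# Works against a local llama.cpp server or a hosted API behind the same interface:
llm = LLMWrapper("http://localhost:8080", api_key=os.getenv("LLM_API_KEY", ""))
```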
Create a `.env` file with an `LLM_API_KEY` entry:

`.env` file:
`LLM_API_KEY="YOUR_API_KEY_HERE"`
If you are using a local LLM, no API key is needed, so you can simply leave the value empty:

`.env` file:
`LLM_API_KEY=`
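For reference, here is a minimal sketch of how the key could be read at runtime, assuming python-dotenv; the project's actual loading code may differ.

```python
# Illustrative only: one common way to read the key, assuming python-dotenv.
# An empty value simply means "no key" (e.g. for a local model without auth).
import os
from dotenv import load_dotenv

load_dotenv()                            # reads .env from the working directory
api_key = os.getenv("LLM_API_KEY", "")   # "" when running a local LLM
```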
1. Start the Docker service: `service docker start`
2. Run the build script: `sudo containers/host_machine/build_docker.sh`
3. Run a test or the program itself, for example: `python3 ./source/test_mission_from_file.py`
PentestAI consists of the following parts:
- Sandboxed machine
- Penetration test mechanism (core part)
- Terminal and GUI interfaces
- User control interface
- Additional tools (e.g., proxy, VPN, Burp interface)
There is a known issue with syncing terminal input and output; it sometimes does not behave as expected.
PentestAI has several big problems:
Our hardware is very limited for running large LLMs (7B+, ideally 14B+) with a sufficient context window (ideally 128k+, or at least 32k+ tokens).
We must implement a flow that manages the security testing step by step. When a step fails, the flow should search for the cause of the error and for alternative ways to accomplish the step or the overall task.
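Such a flow does not exist yet; the sketch below shows one possible shape of the loop, assuming the LLM both proposes commands and diagnoses failures. All names here (`run_in_sandbox`, the prompts) are hypothetical.

```python
# Hypothetical sketch of the step-by-step mission flow described above.
# `llm` is the wrapper from the earlier example; `run_in_sandbox` stands in
# for whatever executes a command inside the sandboxed machine and returns
# (output, exit_code).
def run_mission(llm, run_in_sandbox, steps, max_retries=3):
    for step in steps:
        for _attempt in range(max_retries):
            command = llm.chat([
                {"role": "system", "content": "You are a penetration-testing assistant."},
                {"role": "user", "content": f"Give a single shell command for: {step}"},
            ])
            output, exit_code = run_in_sandbox(command)
            if exit_code == 0:
                break  # step succeeded, move on to the next one
            # On failure, ask the model for the likely cause and an
            # alternative way to accomplish the same step.
            step = llm.chat([
                {"role": "user", "content":
                    f"Step: {step}\nCommand: {command}\nOutput: {output}\n"
                    "Explain the likely cause of the failure and restate the "
                    "step using an alternative approach."},
            ])
        else:
            raise RuntimeError(f"Could not complete step after {max_retries} attempts: {step}")
```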
PentestAI should be capable of identifying security vulnerabilities: recognizing patterns that could lead to security holes in the target system and following how input is processed. This requires reasoning. Some reasoning LLMs (o1 and QwQ) were released recently; we have to evaluate them or provide reasoning through other techniques.
PentestAI sometimes has to interact with the computer the way a human would. We need to implement interfaces for that, and they should be both flexible and fast.
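One possible building block for terminal interaction is pexpect, which drives interactive programs the way a human typing at a terminal would (and might also help with the input/output syncing issue mentioned above). This is only a suggestion, not what PentestAI currently does; the container name and target IP below are made up.

```python
# Suggestion only: pexpect sends keystrokes and waits for the prompt to
# return, mimicking a human operator at an interactive shell.
import pexpect

child = pexpect.spawn(
    "docker exec -it pentest_sandbox /bin/bash",  # container name is hypothetical
    encoding="utf-8",
    timeout=60,
)
child.expect(r"\$ ")                  # wait for the shell prompt
child.sendline("nmap -sV 10.0.0.5")   # type a command like a human would
child.expect(r"\$ ")                  # block until the command finishes
print(child.before)                   # everything printed before the next prompt
```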
All ideas and contributions are welcome: open a new issue, or fork the project and send a pull request.
This project is licensed under the MIT License.