This repository contains the code for the discrete prompt optimization pipeline presented in the paper "Discrete Prompt Optimization for Secure Python Code Generation".
This repository contains 6 main folders:
- data: contains the reference tasks used in the optimization phase and the test tasks used for the evaluation.
- query_preparation: a simple component that builds the query sent to the LLM by combining a code generation prompt with a coding task.
- code_generation: includes implementations for generating code with Codellama 7b, GPT-3.5, GPT-4, Gemini, and DeepSeek-Coder, and for processing the LLM responses (these two steps are sketched in the first example after this list).
- SAST_integration: includes a script that runs Bandit on a given code file and processes the generated report for downstream use.
- prompt_scoring: implements the scoring function that assigns each prompt a score based on the Bandit report (see the second sketch after this list).
- prompt_mutation: implements generic prompt mutation techniques (back translation, paraphrase, and cloze) and security-specific techniques (self-guided and feedback-guided); a paraphrase example is sketched in the third example after this list.
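The first sketch below illustrates the query preparation and code generation steps. It is a minimal, hypothetical example assuming the OpenAI Python SDK (>= 1.0) and an `OPENAI_API_KEY` in the environment; the names `prepare_query` and `generate_code` are illustrative, not necessarily the repository's API.

```python
# Hypothetical sketch of query preparation and code generation.
# Assumes the OpenAI Python SDK (>= 1.0) and OPENAI_API_KEY in the environment;
# function names and the default model are illustrative only.
from openai import OpenAI

client = OpenAI()

def prepare_query(prompt: str, task: str) -> str:
    """Combine a code generation prompt with a coding task."""
    return f"{prompt}\n\n{task}"

def generate_code(query: str, model: str = "gpt-3.5-turbo") -> str:
    """Send the query to the LLM and return the raw response text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": query}],
    )
    return response.choices[0].message.content
```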
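The second sketch covers the SAST integration and prompt scoring steps: it runs Bandit on a code file, parses the JSON report, and computes a severity-weighted score. The weights and the `score_code` helper are assumptions for illustration, not the exact scoring function from the paper.

```python
# Hypothetical sketch of the Bandit run and the prompt scoring step.
# Assumes Bandit is installed (pip install bandit); the severity weights are
# illustrative, not the scoring function from the paper.
import json
import subprocess

SEVERITY_WEIGHTS = {"LOW": 1, "MEDIUM": 3, "HIGH": 5}

def run_bandit(path: str) -> list:
    """Run Bandit with JSON output on one file and return the reported issues."""
    proc = subprocess.run(
        ["bandit", "-f", "json", path],
        capture_output=True,
        text=True,
    )
    report = json.loads(proc.stdout)
    return report.get("results", [])

def score_code(path: str) -> float:
    """Severity-weighted count of Bandit findings (lower is more secure)."""
    issues = run_bandit(path)
    return float(sum(SEVERITY_WEIGHTS.get(i["issue_severity"], 1) for i in issues))
```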
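The third sketch shows a generic paraphrase mutation. The LLM is abstracted behind a caller-supplied `llm` callable (prompt in, completion out), which is an assumed interface for illustration rather than the repository's own.

```python
# Hypothetical sketch of a generic paraphrase mutation; the `llm` callable
# is an assumed interface for illustration.
from typing import Callable

PARAPHRASE_INSTRUCTION = (
    "Paraphrase the following code generation prompt "
    "while preserving its meaning:\n\n{prompt}"
)

def paraphrase_mutation(prompt: str, llm: Callable[[str], str]) -> str:
    """Return a paraphrased variant of the prompt produced by the LLM."""
    return llm(PARAPHRASE_INSTRUCTION.format(prompt=prompt)).strip()
```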
The main optimization algorithm is implemented in the prompt_optimization.py script. Install the dependencies listed in the requirements.txt file and run the optimization script with the following command:
python3 prompt_optimization.py
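At a high level, the optimization loop mutates the current best prompt, generates code for the reference tasks, scores it with Bandit, and keeps the best-scoring candidate. The outline below is a hypothetical sketch of that structure, reusing the illustrative helpers from the earlier sketches (`paraphrase_mutation`, `prepare_query`, `generate_code`, `score_code`); it is not the contents of prompt_optimization.py.

```python
# Hypothetical outline of the optimization loop; it reuses the illustrative
# helpers sketched above and is not the repository's actual implementation.
import tempfile

def write_to_file(code: str) -> str:
    """Write generated code to a temporary .py file and return its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        return f.name

def optimize(seed_prompt, tasks, llm, iterations=10, candidates_per_round=5):
    """Greedy search over prompt mutations scored by Bandit (lower is better)."""
    best_prompt, best_score = seed_prompt, float("inf")
    for _ in range(iterations):
        # Mutate the current best prompt into several candidate prompts.
        candidates = [
            paraphrase_mutation(best_prompt, llm) for _ in range(candidates_per_round)
        ]
        for candidate in candidates:
            # Generate code for every reference task and accumulate its score.
            total = 0.0
            for task in tasks:
                code = generate_code(prepare_query(candidate, task))
                total += score_code(write_to_file(code))
            if total < best_score:
                best_prompt, best_score = candidate, total
    return best_prompt
```

For instance, `optimize(seed, tasks, llm=generate_code)` would run this greedy search using the same model for mutation and code generation.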
The pipeline is computationally demanding, since it involves multiple small and large models for prompt mutation and code generation, so running it on a standard laptop is not recommended. For efficient execution, we recommend a machine with a dedicated GPU or an HPC environment.