
Advanced LLM Prompting for Smart Contract Security

Official Code for the paper: "Advanced Large Language Models Prompting Strategies for Reentrancy Classification and Explanation in Smart Contracts".


This repository contains the official implementation for our research on using advanced prompting strategies for Large Language Models (LLMs) in smart contract security. We introduce a novel approach combining structurally-aware Retrieval-Augmented Generation (RAG) with reasoning-optimized LLMs to reliably detect vulnerabilities and generate human-understandable explanations.

Our key finding is that grounding LLMs in structural evidence (like Control Flow Graphs) is more effective than prescribing a rigid thought process. This method not only achieves state-of-the-art accuracy but also produces trustworthy, actionable explanations, bridging the gap between automated analysis and human expertise.
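
As a rough sketch of the structure-aware retrieval idea (illustrative only: embed_cfg and retrieve_top_k are hypothetical names, not this repository's API), each contract's Control Flow Graph can be embedded and the k most structurally similar labeled examples pulled into the prompt:

    import numpy as np

    def embed_cfg(cfg_edges, dim=64):
        """Toy structural embedding: hash each CFG edge into a fixed-size vector."""
        vec = np.zeros(dim)
        for src, dst in cfg_edges:
            vec[hash((src, dst)) % dim] += 1.0
        norm = np.linalg.norm(vec)
        return vec / norm if norm else vec

    def retrieve_top_k(query_cfg, labeled_corpus, k=3):
        """Return the k labeled contracts whose CFGs are most similar to the query."""
        q = embed_cfg(query_cfg)
        ranked = sorted(labeled_corpus, key=lambda ex: -float(q @ embed_cfg(ex["cfg"])))
        return ranked[:k]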

Experimental Setup

  • Scope: Detecting reentrancy vulnerabilities in Solidity smart contracts, i.e., flaws where an external call lets the callee re-enter a function before its state updates are finalized.
  • Dataset: Our manually verified dataset is available here.
  • Models Evaluated:
    • Traditional ML: BERT, LSTM, FFNN, GNB, GB, XGB, KNN, LR, RF, SVM (code for reproducibility available here)
    • Large Language Models: GPT-4o, GPT-4.1, GPT-4.1-mini, GPT-4.1-nano, o3-mini, o4-mini
  • Prompts: All prompts used in our experiments are available in src/prompts.py.

Installation & Usage

Prerequisites

  • Python 3.8+
  • A virtual environment manager like conda or venv is recommended.

Installation

  1. Clone the repository:

    git clone https://github.com/matteo-rizzo/advanced-llm-prompting-for-reentrancy.git
    cd advanced-llm-prompting-for-reentrancy
  2. Create and activate a virtual environment:

    python3 -m venv .venv
    source .venv/bin/activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Set up your API keys:

    • Create a .env file in the src/ directory by copying the example:
      cp src/.env.example src/.env
    • Add your OpenAI API key (and any other required keys) to the src/.env file. A quick sanity check is sketched below.
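
To verify the key is picked up before running anything, here is a minimal hedged check (it assumes a variable named OPENAI_API_KEY and the python-dotenv package; adjust to whatever src/.env.example actually lists):

    import os
    from dotenv import load_dotenv

    load_dotenv("src/.env")  # load the variables set in the previous step
    assert os.getenv("OPENAI_API_KEY"), "OPENAI_API_KEY is missing from src/.env"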

Reproducing Paper Results

We provide convenient shell scripts to reproduce the main experiments of the paper. The workflow has four steps: three scripts plus a final analysis notebook.

Step 1: Run the Main RAG Pipeline

This script executes the core Structurally-Aware RAG pipeline for vulnerability classification and explanation generation. It is highly configurable; a conceptual sketch of a single pipeline query follows the usage examples below.

  • To run the pipeline with default settings (mode=cfg, k=3, all splits, default models):

    ./src/scripts/xrag.sh
  • To run with a different RAG mode and k-value for a specific model:

    ./src/scripts/xrag.sh --mode ast --k 5 o4-mini
  • To run on a single data split (e.g., split #2):

    ./src/scripts/xrag.sh --split 2
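
Conceptually, a single pipeline query reduces to something like the following. This is a hypothetical outline, not the script's internals: the actual prompts live in src/prompts.py, and classify_with_rag is an invented name. It assumes the official openai Python client.

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def classify_with_rag(contract_src, examples, model="o4-mini"):
        """Classify one contract, grounding the prompt in k retrieved examples."""
        shots = "\n\n".join(f"// label: {ex['label']}\n{ex['code']}" for ex in examples)
        prompt = (
            "You are auditing a Solidity contract for reentrancy.\n"
            f"Structurally similar labeled examples:\n{shots}\n\n"
            f"Target contract:\n{contract_src}\n\n"
            "Answer 'vulnerable' or 'safe', then explain your reasoning."
        )
        resp = client.chat.completions.create(
            model=model, messages=[{"role": "user", "content": prompt}]
        )
        return resp.choices[0].message.content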

Step 2: Run Baseline Evaluations

This script runs the baseline (non-RAG) models for comparison; a zero-shot sketch follows the usage examples below.

  • To run the default baseline models (o3-mini, gpt-4o):

    ./src/scripts/baseline.sh
  • To run a specific baseline model (e.g., gpt-4.1-mini):

    ./src/scripts/baseline.sh gpt-4.1-mini
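
For contrast, a baseline query is the same call with no retrieved context. Again a hypothetical sketch (classify_baseline is an invented name):

    from openai import OpenAI

    client = OpenAI()

    def classify_baseline(contract_src, model="gpt-4o"):
        """Zero-shot classification: the contract alone, no retrieved examples."""
        resp = client.chat.completions.create(
            model=model,
            messages=[{
                "role": "user",
                "content": f"Is this Solidity contract vulnerable to reentrancy?\n{contract_src}",
            }],
        )
        return resp.choices[0].message.content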

Step 3: Evaluate Explanation Quality

This script uses a powerful "evaluator" model to score the quality of the explanations generated by other models (the LLM-as-judge pattern); a minimal sketch follows the usage examples below.

  • To evaluate default models using the default evaluator (o4-mini):

    ./src/scripts/eval_explanations.sh
  • To specify a different evaluator model (e.g., gpt-4o):

    ./src/scripts/eval_explanations.sh --evaluator gpt-4o gpt-4.1
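
A minimal hedged sketch of this LLM-as-judge step (the rubric below is invented for illustration; the real evaluation prompt ships with the repository's scripts):

    from openai import OpenAI

    client = OpenAI()

    def score_explanation(explanation, evaluator="o4-mini"):
        """Ask the evaluator model to grade an explanation on a simple rubric."""
        rubric = (
            "Rate the following reentrancy explanation from 1 to 5 for "
            "correctness, completeness, and actionability. "
            "Reply with three scores and a one-line justification."
        )
        resp = client.chat.completions.create(
            model=evaluator,
            messages=[{"role": "user", "content": f"{rubric}\n\n{explanation}"}],
        )
        return resp.choices[0].message.content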

Step 4: Analyze All Results

All results are logged and saved. You can use the main Jupyter Notebook to visualize and analyze the outputs from all your runs.

jupyter notebook notebooks/rag-results.ipynb

License

This project is licensed under the MIT License. See the LICENSE file for more details.
