Question about instruction for reproducing results in the paper #2

@CStriker

Hello, thank you for releasing the code for the paper.

I consider this paper the current state of the art in LLM-based vulnerability detection, since recent ICSE papers rely on smaller pre-trained models (sLLMs) such as CodeBERT. Your paper reports that VulLLM-CL performs well even when the test dataset differs from the training dataset, which suggests the method generalizes well.

However, I'm trying to reproduce the results in the paper, and the current README lacks a lot of detail. I filled in the missing pieces myself, but I could not reproduce the performance reported in the paper.

Below are the steps I took; could you please let me know what I'm missing?

git clone https://github.com/CGCL-codes/VulLLM
cd VulLLM/CodeLlama

conda create -n llm python=3.8
conda activate llm
conda install -y scikit-learn simplejson
pip install "llama-recipes[tests]"

# download the model_checkpointing folder from https://github.com/meta-llama/llama-recipes/tree/74bde65a62667a38ee0411676cf058c53f85771c

vi configs/datasets.py

train_data_path: str = "../dataset/MixVul/multi_task/multi_train_512_augmentation.json"
valid_data_path: str = "../dataset/MixVul/llm/valid_512.json"
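For reference, this is roughly how the edited section of configs/datasets.py looks on my side. The dataclass name and the `dataset` field are my assumptions based on the usual llama-recipes config layout; only the two path values are taken from the steps above:

```python
from dataclasses import dataclass

# Hypothetical sketch of the dataset config in configs/datasets.py.
# The class name and the `dataset` field are assumptions; only the
# two path fields are the values I actually set.
@dataclass
class custom_dataset:
    dataset: str = "custom_dataset"
    train_data_path: str = "../dataset/MixVul/multi_task/multi_train_512_augmentation.json"
    valid_data_path: str = "../dataset/MixVul/llm/valid_512.json"
```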

#################################################

python finetuning.py \
    --use_peft \
    --model_name codellama/CodeLlama-13b-hf \
    --peft_method lora \
    --batch_size_training 32 \
    --val_batch_size 32 \
    --context_length 512 \
    --quantization \
    --num_epochs 3 \
    --output_dir codellama-13b-multi-r16

python inference.py \
    --model_type codellama \
    --base_model codellama/CodeLlama-13b-hf \
    --tuned_model ./codellama-13b-multi-r16/epoch-2 \
    --data_file ../dataset/ReVeal/test_512.json
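To score the inference output, I use a small script like the following (a minimal sketch; the field names `target` and `prediction` are my guess, since the README does not document the output format, so they may need adjusting):

```python
import json

from sklearn.metrics import accuracy_score, f1_score


def score(pred_file):
    """Compute binary accuracy and F1 from an inference dump.

    Assumes the file is a JSON list of records like
    {"target": 0/1, "prediction": 0/1}; the field names are an
    assumption, adjust them to the actual inference.py output.
    """
    with open(pred_file) as f:
        records = json.load(f)
    y_true = [int(r["target"]) for r in records]
    y_pred = [int(r["prediction"]) for r in records]
    return accuracy_score(y_true, y_pred), f1_score(y_true, y_pred)
```

If my metric computation itself is wrong (e.g. the paper reports F1 over a different label convention), that could also explain part of the gap.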

With the above steps, the 13B model gives very poor results. Surprisingly, the 7B model performs better, but still below the numbers reported in the paper.

Could you please help me figure out what is going wrong?
