Question about instruction for reproducing results in the paper #2

@CStriker

Hello, thank you for releasing the code for the paper.

I consider this paper the current state of the art in LLM-based vulnerability detection, since recent ICSE papers rely on smaller pre-trained models (sLLMs) such as CodeBERT. Your paper reports that VulLLM-CL performs well even when the test dataset differs from the training dataset, which suggests the method generalizes well.

However, I'm trying to reproduce the results in the paper, and the current README lacks a lot of detail. I filled in the missing pieces myself, but I could not reproduce the performance reported in the paper.

Below are the steps I took; could you please let me know what I'm missing?

git clone https://github.com/CGCL-codes/VulLLM
cd VulLLM/CodeLlama

conda create -n llm python=3.8
conda activate llm
conda install -y scikit-learn simplejson
pip install "llama-recipes[tests]"

# download the model_checkpointing folder from https://github.com/meta-llama/llama-recipes/tree/74bde65a62667a38ee0411676cf058c53f85771c

vi configs/datasets.py

train_data_path: str = "../dataset/MixVul/multi_task/multi_train_512_augmentation.json"
valid_data_path: str = "../dataset/MixVul/llm/valid_512.json"
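For reference, this is roughly how the edited section of configs/datasets.py looks on my side. The dataclass name and the `dataset` field are my assumptions based on the usual llama-recipes config layout; only the two path values are taken from the steps above:

```python
from dataclasses import dataclass

# Hypothetical sketch of the dataset config in configs/datasets.py.
# The class name and the `dataset` field are assumptions; only the
# two path fields are the values I actually set.
@dataclass
class custom_dataset:
    dataset: str = "custom_dataset"
    train_data_path: str = "../dataset/MixVul/multi_task/multi_train_512_augmentation.json"
    valid_data_path: str = "../dataset/MixVul/llm/valid_512.json"
```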

#################################################

python finetuning.py \
    --use_peft \
    --model_name codellama/CodeLlama-13b-hf \
    --peft_method lora \
    --batch_size_training 32 \
    --val_batch_size 32 \
    --context_length 512 \
    --quantization \
    --num_epochs 3 \
    --output_dir codellama-13b-multi-r16

python inference.py \
    --model_type codellama \
    --base_model codellama/CodeLlama-13b-hf \
    --tuned_model ./codellama-13b-multi-r16/epoch-2 \
    --data_file ../dataset/ReVeal/test_512.json
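To score the inference output, I use a small script like the following (a minimal sketch; the field names `target` and `prediction` are my guess, since the README does not document the output format, so they may need adjusting):

```python
import json

from sklearn.metrics import accuracy_score, f1_score


def score(pred_file):
    """Compute binary accuracy and F1 from an inference dump.

    Assumes the file is a JSON list of records like
    {"target": 0/1, "prediction": 0/1}; the field names are an
    assumption, adjust them to the actual inference.py output.
    """
    with open(pred_file) as f:
        records = json.load(f)
    y_true = [int(r["target"]) for r in records]
    y_pred = [int(r["prediction"]) for r in records]
    return accuracy_score(y_true, y_pred), f1_score(y_true, y_pred)
```

If my metric computation itself is wrong (e.g. the paper reports F1 over a different label convention), that could also explain part of the gap.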

With the above steps, the 13B model gives very poor results. Surprisingly, the 7B model performs better, but still below the numbers reported in the paper.

Could you please help me figure out what is going wrong?
