- Figure 1: Application background of binary code understanding.
- Figure 2: An overview of the benchmark dataset construction process.
- Figure 3: An overview of the evaluation process.
More details can be found in our paper.
```bash
conda create -n binaryllmEval python=3.8.0
conda activate binaryllmEval
pip install -r requirements.txt
```

We provide scripts to run inference with locally deployed LLMs and to call ChatGPT via its API.
```bash
CUDA_VISIBLE_DEVICES=0 python infer_llama.py
```
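For the ChatGPT baseline, inference goes through the OpenAI API instead of a local model. As a reference point only, here is a minimal sketch of such a call using the official `openai` Python SDK (>= 1.0); the model name and prompt are illustrative assumptions, not the repository's actual script, whose prompts live in `utils.py`.

```python
# Minimal sketch of calling ChatGPT via the OpenAI API (openai >= 1.0).
# Model name and prompt are illustrative; the repo's real prompts are
# defined in utils.py and its API script may be structured differently.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def query_chatgpt(prompt, model="gpt-3.5-turbo"):
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0.0,  # deterministic decoding for fair evaluation
    )
    return response.choices[0].message.content


print(query_chatgpt("Summarize what this decompiled function does: ..."))
```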
The evaluation data is in the `dataset` folder, and the specific prompts are provided in the `utils.py` file.
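To illustrate how a sample is turned into a model input, the sketch below loads one record and fills a prompt template. The file name, JSON field, and template wording here are hypothetical; consult `dataset/` for the real layout and `utils.py` for the exact prompts.

```python
# Illustrative only: the path, field name, and template are hypothetical;
# the actual data layout is in dataset/ and the actual prompts in utils.py.
import json

PROMPT_TEMPLATE = (
    "Here is a decompiled function:\n{code}\n"
    "Please summarize its functionality in one sentence."
)

with open("dataset/summarization_test.jsonl") as f:  # hypothetical file name
    sample = json.loads(f.readline())

prompt = PROMPT_TEMPLATE.format(code=sample["pseudo_code"])  # hypothetical field
print(prompt)
```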
Calculate the Precision, Recall, and F1-score metrics for the function name recovery task:

```bash
python cal_funcname_metrics.py
```
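Function name recovery is typically scored by token-level overlap between the predicted and ground-truth names. The sketch below shows that standard formulation; the tokenization rule (splitting on underscores and camelCase) is an assumption, and `cal_funcname_metrics.py` remains the authoritative implementation.

```python
# Token-level Precision/Recall/F1 for function name recovery.
# The tokenization rule is an assumption; cal_funcname_metrics.py defines
# the exact computation used in the paper.
import re


def name_tokens(name):
    # Split on underscores/punctuation, then on camelCase boundaries:
    # "parse_httpHeader" -> ["parse", "http", "header"]
    tokens = []
    for part in re.split(r"[_\W]+", name):
        tokens += re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", part)
    return [t.lower() for t in tokens if t]


def prf1(pred, truth):
    p_tok, t_tok = name_tokens(pred), name_tokens(truth)
    common = len(set(p_tok) & set(t_tok))
    precision = common / len(p_tok) if p_tok else 0.0
    recall = common / len(t_tok) if t_tok else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


print(prf1("read_file_header", "parse_file_header"))  # ~(0.67, 0.67, 0.67)
```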
Calculate the BLEU-4, METEOR, and Rouge-L metrics for the binary code summarization task:

```bash
python cal_summarization_metrics.py
```
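For reference, all three metrics can be computed per prediction/reference pair with off-the-shelf libraries. The library choices (`nltk` for BLEU-4 and METEOR, Google's `rouge-score` for Rouge-L) and the smoothing setting below are assumptions; `cal_summarization_metrics.py` defines the settings actually used.

```python
# BLEU-4 / METEOR / ROUGE-L for one prediction/reference pair.
# Requires: pip install nltk rouge-score
#           python -c "import nltk; nltk.download('wordnet')"  # for METEOR
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.meteor_score import meteor_score
from rouge_score import rouge_scorer

reference = "writes the buffer contents to the log file"
prediction = "writes the buffer to a log file"
ref_tok, pred_tok = reference.split(), prediction.split()

bleu4 = sentence_bleu(
    [ref_tok], pred_tok,
    weights=(0.25, 0.25, 0.25, 0.25),  # equal 1- to 4-gram weights
    smoothing_function=SmoothingFunction().method1,
)
meteor = meteor_score([ref_tok], pred_tok)  # recent nltk expects tokenized input
rouge_l = rouge_scorer.RougeScorer(["rougeL"]).score(reference, prediction)["rougeL"].fmeasure

print(f"BLEU-4: {bleu4:.4f}  METEOR: {meteor:.4f}  ROUGE-L: {rouge_l:.4f}")
```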
If you find our work helpful, please cite:

```bibtex
@article{shang2025empirical,
  title={An Empirical Study on the Effectiveness of Large Language Models for Binary Code Understanding},
  author={Shang, Xiuwei and Fu, Zhenkan and Cheng, Shaoyin and Chen, Guoqiang and Li, Gangyang and Hu, Li and Zhang, Weiming and Yu, Nenghai},
  journal={arXiv preprint arXiv:2504.21803},
  year={2025}
}
```


