
Investigating LLMs Capabilities in Computing CVSS Vectors
Paper in progress
Francesco Marchiori · Denis Donadel · Mauro Conti
Common Vulnerabilities and Exposures (CVE) records are essential in cybersecurity, providing unique identifiers for publicly known vulnerabilities in software and systems. CVEs are critical for managing and prioritizing security risks, with each vulnerability being assigned a Common Vulnerability Scoring System (CVSS) score to aid in their assessment and remediation. However, variations in CVSS scores between different stakeholders often occur due to subjective interpretations of certain metrics. Moreover, the large volume of new CVEs published daily highlights the need for automation in this process to generate accurate and consistent scores. While previous studies explored various approaches to automation, the role of Large Language Models (LLMs), which have gained significant attention in recent years, remains largely unexplored. In this paper, we investigate the potential of LLMs for CVSS evaluation, focusing on their ability to generate accurate CVSS scores for newly reported vulnerabilities. We explore different prompt engineering strategies to optimize the performance of LLMs and compare their results with embedding-based models, where embeddings are generated and then classified using supervised learning approaches. Our findings suggest that while LLMs show promise in certain aspects of CVSS evaluation, traditional embedding-based systems surprisingly perform better when assessing more subjective components, such as the evaluation of confidentiality, integrity, and availability impacts. These results underline the complexity of vulnerability scoring and emphasize the need for continued exploration of hybrid approaches that combine the strengths of both methods.
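As a rough illustration of the embedding-based baseline described above, the sketch below embeds CVE descriptions and fits a supervised classifier on a single CVSS metric. The embedding model, the classifier, and the toy data are illustrative assumptions, not necessarily the exact setup used in the paper.

# Illustrative sketch: embed CVE descriptions, then classify one CVSS metric.
# Model name, classifier, and data are assumptions for illustration only.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

descriptions = [
    "Buffer overflow in the web interface allows remote code execution.",
    "Improper access control lets a local user read configuration files.",
]
labels = ["NETWORK", "LOCAL"]  # hypothetical Attack Vector (AV) ground truth

encoder = SentenceTransformer("all-MiniLM-L6-v2")
X = encoder.encode(descriptions)  # one embedding per description
clf = LogisticRegression(max_iter=1000).fit(X, labels)

new_desc = ["SQL injection in the login form lets remote attackers dump the database."]
print(clf.predict(encoder.encode(new_desc)))  # predicted AV value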
To replicate our results or start using the LLMs and embedding models, first clone the repository.
git clone https://github.com/spritz-group/LLM-CVSS.git
cd LLM-CVSS
Then, install the required Python packages with the following command. We recommend setting up a dedicated environment for running the experiments (see the example below).
pip install -r requirements.txt
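If you prefer an isolated environment, one option is Python's built-in venv (the directory name below is just an example); create and activate it before installing the requirements.

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt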
The llms.py script also includes OpenAI models. You can disable them by setting useOpenAI to false. If you want to use GPT models instead, add your own API key with the following command.
export OPENAI_API_KEY=<your openai key>
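Once exported, the key is read from the environment by the OpenAI client. The snippet below is a minimal sketch of how the key can be used, not necessarily the exact code in llms.py; the model name and prompt are placeholders.

import os
from openai import OpenAI

# The client picks up OPENAI_API_KEY automatically; passing it explicitly also works.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Derive the CVSS v3.1 vector for this CVE description: ..."}],
)
print(response.choices[0].message.content)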
Models from OpenRouter can also be used. Since they rely on the same OpenAI client, you should manually overwrite the API key in the llms.py script.
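OpenRouter exposes an OpenAI-compatible endpoint, so overriding the client usually amounts to changing the base URL and the key, as in this sketch (the actual variable names in llms.py may differ).

from openai import OpenAI

# Point the OpenAI-compatible client at OpenRouter and use an OpenRouter key.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="<your openrouter key>",
)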