Conflicting Scores, Confusing Signals: An Empirical Study of Vulnerability Scoring Systems
This artifact package contains the data, scripts, and figures for the empirical study presented in the paper. It supports reproducing all key results, including the correlation analysis, agreement metrics, exploit prediction performance, t-SNE visualizations, and scoring system evaluations.
The correlations/ directory contains correlation data and visualizations:
- pearson_correlation_values.csv, spearman_correlation_values.csv, kendall_correlation_values.csv: Correlation coefficient matrices.
- pearson_pvalues.csv, spearman_pvalues.csv, kendall_pvalues.csv: Correlation p-value matrices.
- Corresponding heatmaps.
- All of the above are generated by correlations.py.
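
The coefficient matrices listed above could be computed along the following lines. This is a minimal, dependency-free sketch and not the contents of correlations.py: the function names and sample scores are hypothetical, and Kendall's tau and the p-values are left out (a statistics library would normally supply them).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def correlation_matrix(scores, method):
    """Pairwise coefficients for a {system_name: [scores per CVE]} mapping."""
    names = list(scores)
    return {a: {b: method(scores[a], scores[b]) for b in names} for a in names}


# Made-up scores for four CVEs under two scoring systems.
scores = {
    "cvss": [9.8, 7.5, 5.3, 4.0],
    "epss": [0.90, 0.10, 0.30, 0.02],
}
matrix = correlation_matrix(scores, spearman)
```

Each inner dictionary corresponds to one row of a coefficient matrix, so writing it out with one row per scoring system would give a layout similar to the CSV files above.
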
The epss-analysis/ directory includes the EPSS-KEV analysis data and script:
- epss_scores-YYYY-MM-DD.csv: EPSS score snapshots used for temporal analysis.
- results-12-04-2024-1509.csv: Raw EPSS-KEV data output (generated by kev-epss-compare.py).
- results-all-2021-2024.xlsx: Analyzed EPSS-KEV data.
- kev-epss-compare.py: Program that conducts EPSS-KEV analysis.
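
One comparison of this kind can be sketched as follows: look up each KEV entry in an EPSS snapshot and measure how many fall at or above a score threshold. This is an illustration under assumptions, not the logic of kev-epss-compare.py: the column names `cve` and `epss`, the `#` comment line, the threshold, and all CVE identifiers below are hypothetical.

```python
import csv
import io


def load_epss(fileobj):
    """Read a snapshot into {cve_id: score}, skipping '#' comment lines."""
    rows = csv.DictReader(line for line in fileobj if not line.startswith("#"))
    return {row["cve"]: float(row["epss"]) for row in rows}


def kev_coverage(epss, kev_ids, threshold):
    """Share of KEV CVEs whose EPSS score is at or above the threshold.

    KEV entries missing from the snapshot count as not flagged.
    """
    kev_ids = list(kev_ids)
    if not kev_ids:
        return 0.0
    flagged = sum(1 for cve in kev_ids if epss.get(cve, 0.0) >= threshold)
    return flagged / len(kev_ids)


# Made-up snapshot and KEV list, for illustration only.
snapshot = io.StringIO(
    "#hypothetical header comment\n"
    "cve,epss\n"
    "CVE-0000-0001,0.92\n"
    "CVE-0000-0002,0.03\n"
    "CVE-0000-0003,0.40\n"
)
epss = load_epss(snapshot)
kev = ["CVE-0000-0001", "CVE-0000-0002", "CVE-0000-0004"]
coverage = kev_coverage(epss, kev, threshold=0.1)
```

Repeating the calculation over several dated snapshots would give a temporal view of how EPSS scores relate to KEV membership.
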
Figures used in the paper.
- Agreement_Scores.csv: Cohen’s Kappa and Krippendorff’s Alpha for all system pairs. Used in Table 3.
- Agreement_Scores_with_CWE.csv: Same as above but extended with per-CWE values. Used in Table 7.
- alpha_heatmap.png: Heatmap visualization of Krippendorff’s Alpha.
- kappa_heatmap.png: Heatmap visualization of Cohen’s Kappa.
- bin-analysis.xlsx: Bin-based analysis of CVE score distributions across scoring systems. Used in Figure 4.
- CVE_Ranking_Overlap_Stats.csv: Top-N overlap counts between scoring systems. Used in Figure 5.
- CVE_Rankings.csv: Full ranking of all CVEs per scoring system.
- known_exploited_vulnerabilities.csv: CISA KEV catalog used for identifying known exploited vulnerabilities.
- normalized-data.csv: Normalized score values used for t-SNE and heatmaps.
- PT-all-scores-all-approaches.csv: Raw score values from all scoring systems for all CVEs.
- PT-all-scores-all-approaches-with-kev.csv: Same as above, with an additional column indicating KEV membership. Used in Table 5.
- CohenKrippendorf.py: Computes Cohen’s Kappa and Krippendorff’s Alpha and generates corresponding spreadsheets and heatmaps.
- correlations.py: Computes Pearson, Spearman, and Kendall correlation matrices and generates correlations/.
- heatmap.py: Generates the heatmap of normalized scores across CVEs and scoring systems. Used in Figure 2.
- IsKEV.py: Cross-references CVEs with the KEV catalog.
- topN.py: Calculates CVE rankings per scoring system and overlap metrics. Generates CVE_Rankings.csv, CVE_Ranking_Overlap_Stats.csv, and the Figure 5 visualization.
- TSNE.py: Performs t-SNE on normalized scores and generates the Figure 1 visualization as well as normalized-data.csv.
- TSNE-CWE.py: Generates t-SNE visualizations for top CWEs. Used in Figure 6.
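
For readers unfamiliar with the agreement metrics, Cohen's Kappa compares the observed agreement p_o between two raters with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below is a minimal illustration, not the code in CohenKrippendorf.py. It assumes each scoring system's output has already been binned into categorical labels; the labels and sample data are hypothetical, and Krippendorff's Alpha is not covered.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both systems used one identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up severity bins for four CVEs under two scoring systems.
system_a = ["high", "high", "low", "low"]
system_b = ["high", "low", "low", "low"]
kappa = cohens_kappa(system_a, system_b)
```

Here the two systems agree on three of four CVEs (p_o = 0.75) while chance agreement is p_e = 0.5, so kappa comes out at 0.5.
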
To reproduce specific components of the paper:
Analysis | Run This Script | Output |
---|---|---|
t-SNE Visualization (Fig 1) | TSNE.py | tsne_by_score_agreement_with_epss.png |
Normalized Score Heatmap (Fig 2) | heatmap.py | heatmap_normalized_sorted.png |
Correlation Analysis | correlations.py | CSVs and plots in correlations/ |
Agreement Scores (Tables 3 & 7) | CohenKrippendorf.py | Agreement_Scores.csv, Agreement_Scores_with_CWE.csv, heatmaps |
Top-N Ranking Analysis (Fig 5) | topN.py | Ranking CSVs and overlap plots |
EPSS vs. Exploitation Analysis (Table 6) | IsKEV.py, together with the contents of epss-analysis/ | results-all-2021-2024.xlsx (processed results) |
CWE-based t-SNE (Fig 6) | TSNE-CWE.py | tsne_cwe_agreement.png |
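
The table above can be turned into a small driver, sketched here under stated assumptions: that each script takes no command-line arguments and is run from the directory containing it. The short analysis keys and the function names are hypothetical. The EPSS vs. exploitation row is omitted because it does not map to a single script.

```python
import subprocess
import sys

# Script per analysis, copied from the table above. The keys are
# hypothetical shorthand, not names used by the artifact.
SCRIPTS = {
    "tsne": "TSNE.py",
    "heatmap": "heatmap.py",
    "correlations": "correlations.py",
    "agreement": "CohenKrippendorf.py",
    "topn": "topN.py",
    "tsne-cwe": "TSNE-CWE.py",
}


def build_command(analysis):
    """Return the command that runs the script for one analysis."""
    return [sys.executable, SCRIPTS[analysis]]


def run(analysis):
    """Run one analysis; raises CalledProcessError if the script fails."""
    # Assumption: no arguments are needed and the working directory is
    # the one that holds the scripts and their input data.
    subprocess.run(build_command(analysis), check=True)
```

For example, `run("topn")` would invoke topN.py with the current interpreter. Since TSNE.py is listed as generating normalized-data.csv, running it before heatmap.py seems prudent.
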