VulnSage.zip

Download (145.23 MB)
dataset
posted on 2025-03-15, 05:01, authored by Arastoo Zibaeirad

Automating software vulnerability detection (SVD) remains a critical challenge in an era of increasingly complex and interdependent software systems. Despite significant advances in Large Language Models (LLMs) for code analysis, prevailing evaluation methodologies often lack the context-aware robustness necessary to capture real-world intricacies and cross-component interactions. To address these limitations, we present VulnSage, a comprehensive evaluation framework and a dataset curated from diverse, large-scale open-source system software projects developed in C/C++. Unlike prior datasets, VulnSage leverages a heuristic noise pre-filtering approach combined with LLM-based reasoning to ensure a representative and minimally noisy spectrum of vulnerabilities. The framework can be used to rigorously assess LLMs by supporting a multi-granular analysis across function, file, and inter-function levels and employing four diverse zero-shot prompt strategies: Baseline, Chain-of-Thought, Think, and Think & Verify.
