Abstract
Vulnerability Detection (VD) using machine learning faces a significant
challenge: the vast diversity of vulnerability types. Each Common Weakness
Enumeration (CWE) represents a unique category of vulnerabilities with distinct
characteristics, code semantics, and patterns. Treating all vulnerabilities as
a single label in a binary classification setting may oversimplify the
problem, as it fails to capture the nuances and context specific to each CWE.
As a result, a single binary classifier might merely rely on superficial text
patterns rather than understanding the intricacies of each vulnerability type.
Recent reports show that even state-of-the-art Large Language Models (LLMs)
with hundreds of billions of parameters struggle to generalize when detecting
vulnerabilities. Our work investigates a different approach that leverages
CWE-specific classifiers to address the heterogeneity of vulnerability types.
We hypothesize that training separate classifiers for each CWE will enable the
models to capture the unique characteristics and code semantics associated with
each vulnerability category. To test this hypothesis, we conduct an ablation study by
training individual classifiers for each CWE and evaluating their performance
independently. Our results demonstrate that CWE-specific classifiers outperform
a single binary classifier trained on all vulnerabilities. Building upon this,
we explore strategies to combine the CWE-specific classifiers into a unified
vulnerability detection system using a multiclass approach. Although the lack
of large, high-quality datasets for vulnerability detection remains a major
obstacle, our results show that multiclass detection can be a better path
toward practical vulnerability detection in the future. All our models and the
code used to produce our results are open-sourced.
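
The three problem formulations contrasted above (a single binary label, one
classifier per CWE, and a unified multiclass task) differ mainly in how the
training labels are built. The sketch below illustrates that difference on toy
data. It is not the authors' pipeline: the samples, the function names, the
one-vs-rest construction of the per-CWE datasets, and the use of class 0 for
non-vulnerable code are all assumptions made for illustration.

```python
# Hypothetical toy samples: (code snippet, CWE id, or None if not vulnerable).
SAMPLES = [
    ("strcpy(buf, src);", "CWE-120"),
    ("free(p); free(p);", "CWE-415"),
    ("strncpy(buf, src, sizeof buf - 1);", None),
    ("memcpy(dst, src, n);", "CWE-120"),
]


def binary_labels(samples):
    """Single binary task: 1 = vulnerable (any CWE), 0 = not vulnerable."""
    return [(code, 0 if cwe is None else 1) for code, cwe in samples]


def per_cwe_datasets(samples):
    """One binary dataset per CWE (assumed one-vs-rest): 1 = this CWE, 0 = anything else."""
    cwes = sorted({cwe for _, cwe in samples if cwe is not None})
    return {
        target: [(code, int(cwe == target)) for code, cwe in samples]
        for target in cwes
    }


def multiclass_labels(samples):
    """Single multiclass task: class 0 = not vulnerable, classes 1..K = individual CWEs."""
    cwes = sorted({cwe for _, cwe in samples if cwe is not None})
    class_ids = {cwe: i + 1 for i, cwe in enumerate(cwes)}
    labeled = [(code, class_ids.get(cwe, 0)) for code, cwe in samples]
    return labeled, class_ids
```

In the binary formulation the two `CWE-120` samples and the `CWE-415` sample
all collapse into the same positive class, which is the loss of per-CWE
information the abstract argues against. The multiclass formulation keeps them
distinct while still yielding a single model.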