Abstract
Background: The C and C++ languages hold significant importance in Software
Engineering research because of their widespread use in practice. Numerous
studies have utilized Machine Learning (ML) and Deep Learning (DL) techniques
to detect software vulnerabilities (SVs) in source code written in these
languages. However, the application of these techniques in function-level SV
assessment has been largely unexplored. SV assessment is increasingly crucial
as it provides detailed information on the exploitability, impacts, and
severity of security defects, thereby aiding in their prioritization and
remediation. Aims: We conduct the first empirical study to investigate and
compare the performance of ML and DL models, many of which have been used for
SV detection, for function-level SV assessment in C/C++. Method: Using 9,993
vulnerable C/C++ functions, we evaluated the performance of six multi-class ML
models and five multi-class DL models for SV assessment at the function
level based on the Common Vulnerability Scoring System (CVSS). We further
explore multi-task learning, which can leverage common vulnerable code to
predict all SV assessment outputs simultaneously in a single model, and compare
the effectiveness and efficiency of this model type with those of the original
multi-class models. Results: We show that the ML models match or even
outperform the multi-class DL models for function-level SV assessment while
requiring significantly less training time. Employing multi-task learning
allows the DL models to perform significantly better, with an average increase
of 8-22% in Matthews Correlation Coefficient (MCC). Conclusions: We distill the
practices of using data-driven techniques for function-level SV assessment in
C/C++, including the use of multi-task DL to balance efficiency and
effectiveness. This can establish a strong foundation for future work in this
area.
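
For readers unfamiliar with the reported metric, the following is a minimal sketch of how a multi-class MCC can be computed from true and predicted labels, using the standard multi-class generalization of the Matthews Correlation Coefficient. The function name `multiclass_mcc` and the example severity labels are illustrative assumptions, not taken from the study, and this is not presented as the authors' evaluation code.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews Correlation Coefficient (hypothetical helper).

    Uses the standard generalization:
        (c*s - sum_k t_k*p_k) / sqrt((s^2 - sum_k p_k^2) * (s^2 - sum_k t_k^2))
    where s is the number of samples, c the number of correct predictions,
    t_k the number of samples whose true class is k, and p_k the number of
    samples predicted as class k.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_counts = Counter(y_true)  # true occurrences per class
    p_counts = Counter(y_pred)  # predicted occurrences per class
    labels = set(t_counts) | set(p_counts)

    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    spread_true = s * s - sum(v * v for v in t_counts.values())
    spread_pred = s * s - sum(v * v for v in p_counts.values())
    denominator = sqrt(spread_true * spread_pred)

    # By convention, return 0.0 when the score is undefined
    # (for example, when every prediction is the same class).
    return numerator / denominator if denominator else 0.0


# Illustrative labels only (assumed severity classes for one CVSS output).
y_true = ["LOW", "MEDIUM", "HIGH", "HIGH"]
y_pred = ["LOW", "HIGH", "HIGH", "HIGH"]
score = multiclass_mcc(y_true, y_pred)
```

A score of 1 indicates perfect agreement, 0 indicates performance no better than chance, and negative values indicate disagreement, which makes MCC less sensitive to class imbalance than plain accuracy.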