How Far Have We Gone in Vulnerability Detection Using Large Language Models

arXiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2311.12420

PDF

https://arxiv.org/pdf/2311.12420

Bibliographic Information

Authors: Zeyu Gao; Hao Wang; Yuchen Zhou; Wenyu Zhu; Chao Zhang
Published: 2023-11-21
Updated: 2023-12-22
Affiliation: Tsinghua University
Country of affiliation: China
Venue: Computing Research Repository (CoRR)

Labels Estimated by AI

Vulnerability Detection

Code Change Analysis

Evaluation Methods

* These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the Literature Database".

Abstract

As software becomes increasingly complex and prone to vulnerabilities, automated vulnerability detection is critically important, yet challenging. Given the significant successes of large language models (LLMs) in various tasks, there is growing anticipation of their efficacy in vulnerability detection. However, a quantitative understanding of their potential in vulnerability detection is still missing. To bridge this gap, we introduce a comprehensive vulnerability benchmark VulBench. This benchmark aggregates high-quality data from a wide range of CTF (Capture-the-Flag) challenges and real-world applications, with annotations for each vulnerable function detailing the vulnerability type and its root cause. Through our experiments encompassing 16 LLMs and 6 state-of-the-art (SOTA) deep learning-based models and static analyzers, we find that several LLMs outperform traditional deep learning approaches in vulnerability detection, revealing an untapped potential in LLMs. This work contributes to the understanding and utilization of LLMs for enhanced software security.

External Datasets

VulBench

MAGMA

Devign

D2A

Big-Vul