Harnessing the Power of LLMs in Source Code Vulnerability Detection
Abstract
Software vulnerabilities, caused by unintentional flaws in source code, are a primary root cause of cyberattacks. Static analysis of source code has been widely used to detect these defects, which software developers introduce unintentionally. Large Language Models (LLMs) have demonstrated human-like conversational abilities because they capture complex patterns in sequential data such as natural language. In this paper, we harness LLMs' capabilities to analyze source code and detect known vulnerabilities. To make the proposed detection method applicable across multiple programming languages, we convert source code to the LLVM intermediate representation (LLVM IR) and train LLMs on these representations. We conduct extensive experiments on various LLM architectures and compare their accuracy. Our experiments on real-world and synthetic code from the National Vulnerability Database (NVD) and the Software Assurance Reference Dataset (SARD) demonstrate high accuracy in identifying source code vulnerabilities.
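The preprocessing idea in the abstract, lowering source code to LLVM IR before a model sees it, can be illustrated with a minimal sketch. The abstract does not describe the authors' actual pipeline, so everything here is an assumption: the helper names `clang_emit_ir_command` and `normalize_ir` are hypothetical, and renaming local values is one plausible normalization, not necessarily the one used in the paper. The only external fact relied on is that clang emits textual IR when given `-S -emit-llvm`.

```python
import re
from typing import Dict, List


def clang_emit_ir_command(source_path: str, output_path: str) -> List[str]:
    """Build a clang invocation that writes textual LLVM IR for one source file.

    The command is only constructed here, not executed, so the sketch does not
    require clang to be installed.
    """
    # -S together with -emit-llvm asks clang for human-readable IR (.ll).
    return ["clang", "-S", "-emit-llvm", "-O0", source_path, "-o", output_path]


# Matches local identifiers such as %sum or %a. In this simplified sketch it
# also matches named types such as %struct.node.
_LOCAL_NAME = re.compile(r"%[A-Za-z0-9_.]+")


def normalize_ir(ir_text: str) -> str:
    """Drop comments and rename local values in order of first appearance.

    Assumption: consistent placeholder names (%v0, %v1, ...) make equivalent
    code from different projects look alike to a tokenizer.
    """
    renamed: Dict[str, str] = {}

    def rename(match: "re.Match[str]") -> str:
        name = match.group(0)
        if name not in renamed:
            renamed[name] = f"%v{len(renamed)}"
        return renamed[name]

    lines = []
    for raw in ir_text.splitlines():
        # Simplification: treats every ';' as a comment start, which would be
        # wrong for a ';' inside a string constant.
        line = raw.split(";", 1)[0].strip()
        if line:
            lines.append(_LOCAL_NAME.sub(rename, line))
    return "\n".join(lines)
```

In such a setup, the normalized text would then be tokenized and passed to a sequence classifier; that stage is omitted because the abstract gives no details about the model inputs or labels.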