Abstract
Software vulnerabilities, caused by unintentional flaws in source code, are a
primary root cause of cyberattacks. Static analysis of source code has been
widely used to detect such defects introduced by software developers. Large
Language Models (LLMs) have demonstrated human-like conversational abilities
owing to their capacity to capture complex patterns in sequential data such as
natural language. In this paper, we harness LLMs' capabilities to analyze
source code and detect known vulnerabilities. To make the proposed
vulnerability detection method universal across programming languages, we
convert source code to LLVM intermediate representation (IR) and train LLMs on
these intermediate representations. We conduct extensive experiments on various
LLM architectures and compare their accuracy. Our comprehensive experiments on
real-world and synthetic code from the NVD and SARD datasets demonstrate high
accuracy in identifying source-code vulnerabilities.
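As a rough illustration of the language-agnostic preprocessing the abstract describes, the sketch below compiles a source file to textual LLVM IR (via clang's standard `-S -emit-llvm` flags) and then normalizes the IR into a token sequence an LLM could consume. The function names, the metadata filtering, and the value-name canonicalization are illustrative assumptions, not the authors' actual pipeline.

```python
import re
import subprocess

def source_to_llvm_ir(path: str) -> str:
    """Emit textual LLVM IR for a C/C++ file using clang (-S -emit-llvm).
    Assumes clang is installed; writes the IR to stdout via '-o -'."""
    result = subprocess.run(
        ["clang", "-S", "-emit-llvm", "-o", "-", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def normalize_ir(ir_text: str) -> list[str]:
    """Strip IR comments and metadata, canonicalize local value names,
    and split into whitespace tokens, so samples from different source
    languages share one vocabulary (a hypothetical preprocessing step)."""
    tokens = []
    for line in ir_text.splitlines():
        line = line.split(";", 1)[0].strip()  # drop ';' comments
        if not line or line.startswith("!"):  # skip metadata lines
            continue
        # replace local value names (%a, %tmp.1, %42) with a placeholder
        line = re.sub(r"%[\w.]+", "%v", line)
        tokens.extend(line.split())
    return tokens

# Example IR for a trivial add function:
ir = """; ModuleID = 'a.c'
define i32 @add(i32 %a, i32 %b) {
  %s = add i32 %a, %b
  ret i32 %s
}
"""
print(normalize_ir(ir)[:6])
# → ['define', 'i32', '@add(i32', '%v,', 'i32', '%v)']
```

Canonicalizing value names is one common way to keep the token vocabulary small and to stop a model from keying on identifier spellings rather than code structure; whether the paper's pipeline does this is an assumption here.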