Abstract
Most vulnerability detection studies focus on datasets of vulnerabilities in
C/C++ code, offering limited language diversity. Thus, the effectiveness of
deep learning methods, including large language models (LLMs), in detecting
software vulnerabilities beyond these languages is still largely unexplored. In
this paper, we evaluate the effectiveness of LLMs in detecting vulnerabilities
and classifying them by Common Weakness Enumeration (CWE) using different
prompt and role strategies.
Our experimental study targets six state-of-the-art pre-trained LLMs
(GPT-3.5-Turbo, GPT-4 Turbo, GPT-4o, CodeLlama-7B, CodeLlama-13B, and Gemini 1.5 Pro)
and five programming languages: Python, C, C++, Java, and JavaScript. We
compiled a multi-language vulnerability dataset from several sources to ensure
representativeness. Our results show that GPT-4o achieves the highest
vulnerability detection and CWE classification scores in a few-shot setting.
Beyond the quantitative results of our study, we developed a library called
CODEGUARDIAN, integrated with VS Code, which enables developers to perform
LLM-assisted, real-time vulnerability analysis in real-world security scenarios.
We evaluated CODEGUARDIAN in a user study involving 22 industry developers.
The study showed that CODEGUARDIAN makes developers more accurate and faster
at detecting vulnerabilities.
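To make the few-shot setting above concrete, the sketch below shows how one of the evaluated models (GPT-4o) could be prompted to flag a snippet and name its CWE. This is not the paper's actual prompt or CODEGUARDIAN's implementation: the system role, worked examples, question wording, and output format are all illustrative assumptions; only the generic OpenAI chat-completions call is standard.

```python
# Minimal sketch of few-shot CWE classification with an OpenAI-style chat API.
# NOTE: the prompts and examples are assumptions, not the paper's prompts.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

QUESTION = "Is this code vulnerable? If so, name the CWE."

# A couple of worked examples condition the model (the "few-shot" part).
FEW_SHOT = [
    {"role": "user",
     "content": f'Code:\nquery = "SELECT * FROM users WHERE id = " + user_id\n{QUESTION}'},
    {"role": "assistant", "content": "VULNERABLE: CWE-89 (SQL Injection)"},
    {"role": "user",
     "content": f'Code:\ncursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))\n{QUESTION}'},
    {"role": "assistant", "content": "NOT VULNERABLE"},
]

def classify(snippet: str) -> str:
    """Label one snippet, conditioned on a role instruction and the examples."""
    messages = (
        [{"role": "system", "content": "You are a security analyst."}]  # role strategy
        + FEW_SHOT
        + [{"role": "user", "content": f"Code:\n{snippet}\n{QUESTION}"}]
    )
    resp = client.chat.completions.create(model="gpt-4o", messages=messages, temperature=0)
    return resp.choices[0].message.content

print(classify('os.system("ping " + host)'))  # expect something like CWE-78
```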