Paper Information
- Author
- Fangzhou Wu;Qingzhao Zhang;Ati Priya Bajaj;Tiffany Bao;Ning Zhang;Ruoyu "Fish" Wang;Chaowei Xiao
- Published
- December 8, 2023
- Affiliation
- University of Wisconsin-Madison
- Country
- United States of America
- Venue
- Computing Research Repository (CoRR)
Abstract
Large language models (LLMs) have undergone rapid evolution and achieved
remarkable results in recent years. OpenAI's ChatGPT, backed by GPT-3.5 or
GPT-4, has gained instant popularity due to its strong capability across a wide
range of tasks, including natural language tasks, coding, mathematics, and
engaging conversations. However, the impact and limits of such LLMs in the
system security domain remain underexplored. In this paper, we delve into the limits of
LLMs (i.e., ChatGPT) in seven software security applications including
vulnerability detection/repair, debugging, debloating, decompilation, patching,
root cause analysis, symbolic execution, and fuzzing. Our exploration reveals
that ChatGPT not only excels at generating code, which is the conventional
application of language models, but also demonstrates strong capability in
understanding user-provided commands in natural languages, reasoning about
control and data flows within programs, generating complex data structures, and
even decompiling assembly code. Notably, GPT-4 showcases significant
improvements over GPT-3.5 in most security tasks. We also identify certain
limitations of ChatGPT in security-related tasks, such as its constrained
ability to process long code contexts.