Abstract
Software vulnerabilities continue to be ubiquitous, even in the era of
AI-powered code assistants, advanced static analysis tools, and the adoption of
extensive testing frameworks. It has become apparent that we must not only
prevent these bugs, but also eliminate them quickly and efficiently. Yet,
human code intervention is slow, costly, and can often lead to further security
vulnerabilities, especially in legacy codebases. The advent of highly capable
Large Language Models (LLMs) has opened up the possibility of patching many
software defects automatically. We propose LLM4CVE, an LLM-based iterative
pipeline that robustly fixes vulnerable functions in real-world code with high
accuracy. We evaluate our pipeline with state-of-the-art LLMs, such as GPT-3.5,
GPT-4o, Llama 3 8B, and Llama 3 70B. We achieve a human-verified quality score
of 8.51/10 and an increase in ground-truth code similarity of 20% with Llama 3
70B. To promote further research in LLM-based vulnerability repair, we publish
our testing apparatus, fine-tuned weights, and experimental data on our
website.