Repairing vulnerabilities without invisible hands. A differentiated replication study on LLMs

arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2507.20977

PDF

https://arxiv.org/pdf/2507.20977

Paper Information

Authors: Maria Camporese, Fabio Massacci
Published: 7-29-2025
Affiliation: University of Trento, IT
Country: Italy
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Vulnerability Management, Prompt Injection, Large Language Model

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Background: Automated Vulnerability Repair (AVR) is a fast-growing branch of program repair. Recent studies show that large language models (LLMs) outperform traditional techniques, extending their success beyond code generation and fault detection.

Hypothesis: These gains may be driven by hidden factors -- "invisible hands" such as training-data leakage or perfect fault localization -- that let an LLM reproduce human-authored fixes for the same code.

Objective: We replicate prior AVR studies under controlled conditions by deliberately adding errors to the reported vulnerability location in the prompt. If LLMs merely regurgitate memorized fixes, both small and large localization errors should yield the same number of correct patches, because any offset should divert the model from the original fix.

Method: Our pipeline repairs vulnerabilities from the Vul4J and VJTrans benchmarks after shifting the fault location by n lines from the ground truth. A first LLM generates a patch, a second LLM reviews it, and we validate the result with regression and proof-of-vulnerability tests. Finally, we manually audit a sample of patches and estimate the error rate with the Agresti-Coull-Wilson method.
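Two steps in the method lend themselves to a small illustration: shifting the reported fault location by n lines, and putting a confidence interval on the error rate found in the manual audit. The Python sketch below is not taken from the paper. The function names are hypothetical, clamping the shifted line to the file boundaries is an assumed rule, and the interval shown is the standard Agresti-Coull adjusted interval, which may differ in detail from the "Agresti-Coull-Wilson" procedure the authors used.

```python
import math


def shift_fault_location(true_line: int, offset: int, num_lines: int) -> int:
    """Return the line number reported in the prompt.

    The ground-truth line is moved by `offset` lines (negative = earlier).
    Clamping to 1..num_lines is an assumption; the abstract does not say
    how offsets that fall outside the file are handled.
    """
    if num_lines < 1 or not 1 <= true_line <= num_lines:
        raise ValueError("true_line must lie within 1..num_lines")
    return max(1, min(num_lines, true_line + offset))


def agresti_coull_interval(errors: int, sample_size: int, z: float = 1.96):
    """Agresti-Coull adjusted confidence interval for a proportion.

    `errors` is the number of audited patches judged wrong, `sample_size`
    the number of audited patches, and z = 1.96 corresponds to roughly
    95% confidence. Returns (lower, upper), clipped to [0, 1].
    """
    if sample_size <= 0 or not 0 <= errors <= sample_size:
        raise ValueError("need 0 <= errors <= sample_size and sample_size > 0")
    n_adj = sample_size + z * z              # adjusted sample size
    p_adj = (errors + z * z / 2) / n_adj     # adjusted proportion
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


if __name__ == "__main__":
    # Illustrative numbers only; these are not results from the study.
    print(shift_fault_location(true_line=120, offset=5, num_lines=300))
    print(agresti_coull_interval(errors=8, sample_size=10))
```

The adjusted interval is used here because it stays well behaved for small audit samples and for proportions near 0 or 1, where the plain normal approximation breaks down.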

External Datasets

Vul4J

VJBench

VJBench-trans