Abstract
The increasing complexity of software systems and the sophistication of
cyber-attacks have underscored the critical need for effective automated
vulnerability detection and repair systems. Traditional methods, such as static
program analysis, face significant challenges related to scalability,
adaptability, and high false-positive and false-negative rates. AI-driven
approaches, particularly those using machine learning and deep learning models,
show promise but are heavily reliant on the quality and quantity of training
data. This paper introduces a novel framework designed to automatically
introduce realistic, category-specific vulnerabilities into secure C/C++
codebases to generate datasets. The proposed approach coordinates multiple AI
agents that simulate expert reasoning, along with function agents and
traditional code analysis tools. It leverages Retrieval-Augmented Generation
for contextual grounding and employs Low-Rank approximation of weights for
efficient model fine-tuning. Our experimental study on 116 code samples from
three different benchmarks suggests that our approach outperforms other
techniques with regard to dataset accuracy, achieving success rates between 89% and
95% when injecting vulnerabilities at the function level.