LLMxCPG: Context-Aware Vulnerability Detection Through Code Property Graph-Guided Large Language Models


arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2507.16585

PDF

https://arxiv.org/pdf/2507.16585

Paper Information

Authors: Ahmed Lekssays, Hamza Mouhcine, Khang Tran, Ting Yu, Issa Khalil
Published: 2025-07-22
Affiliation: Qatar Computing Research Institute
Country: Qatar
Conference: USENIX Security Symposium

Labels Estimated by AI

Prompt leaking, Vulnerability detection method, Dataset Analysis

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Software vulnerabilities present a persistent security challenge, with over 25,000 new vulnerabilities reported in the Common Vulnerabilities and Exposures (CVE) database in 2024 alone. While deep learning-based approaches show promise for vulnerability detection, recent studies reveal critical limitations in terms of accuracy and robustness: accuracy drops by up to 45% on rigorously verified datasets, and performance degrades significantly under simple code modifications. This paper presents LLMxCPG, a novel framework integrating Code Property Graphs (CPGs) with Large Language Models (LLMs) for robust vulnerability detection. Our CPG-based slice construction technique reduces code size by 67.84% to 90.93% while preserving vulnerability-relevant context. Our approach's ability to provide a more concise and accurate representation of code snippets enables the analysis of larger code segments, including entire projects. This concise representation is a key factor behind the improved detection capabilities of our method, as it can now identify vulnerabilities that span multiple functions. Empirical evaluation demonstrates LLMxCPG's effectiveness across verified datasets, achieving 15-40% improvements in F1-score over state-of-the-art baselines. Moreover, LLMxCPG maintains high performance across function-level and multi-function codebases while exhibiting robust detection efficacy under various syntactic code modifications.
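
Illustrative sketch (not from the paper): the abstract does not say how slices are constructed, so the snippet below only illustrates the general idea of keeping the lines a suspicious statement depends on, and how a code-size reduction percentage of the kind quoted above could be computed. The dependency map, the line numbers, and the function names `backward_slice` and `reduction_percent` are all hypothetical; a real pipeline would derive dependencies from a code property graph rather than a hand-written dictionary.

```python
from collections import deque


def backward_slice(deps, criterion):
    """Collect every line the criterion line transitively depends on.

    deps maps a line number to the line numbers it directly depends on
    (a toy stand-in for data/control dependence edges in a CPG).
    """
    seen = {criterion}
    queue = deque([criterion])
    while queue:
        line = queue.popleft()
        for dep in deps.get(line, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def reduction_percent(original_lines, kept_lines):
    """Percentage of lines removed by slicing."""
    return 100.0 * (original_lines - kept_lines) / original_lines


# Hypothetical 10-line function; line 9 is an assumed sink (e.g. a copy call).
deps = {9: [4, 7], 7: [2], 4: [2], 2: [], 5: [3], 3: []}
slice_lines = sorted(backward_slice(deps, 9))
reduction = reduction_percent(10, len(slice_lines))
print(slice_lines, reduction)
```

In this toy case, lines 3 and 5 are dropped because the sink never depends on them, which is the sense in which a slice can shrink the input while keeping the context relevant to the vulnerability.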

External Datasets

FormAI-v2

PrimeVul

SVEN

ReposVul