Large Language Models (LLMs) are increasingly used in software security, but
their trustworthiness in generating accurate vulnerability advisories remains
uncertain. This study investigates the ability of ChatGPT to (1) generate
plausible security advisories from CVE-IDs, (2) differentiate real from fake
CVE-IDs, and (3) extract CVE-IDs from advisory descriptions. Using a curated
dataset of 100 real and 100 fake CVE-IDs, we manually analyzed the credibility
and consistency of the model's outputs. The results show that ChatGPT
generated plausible security advisories for 96% of the real CVE-IDs and 97%
of the fake CVE-IDs supplied as input, demonstrating that it cannot reliably
differentiate real from fake IDs. Furthermore, when the generated advisories
were fed back to ChatGPT to identify their original CVE-IDs, the model
returned a fake CVE-ID for 6% of the advisories generated from real CVE-IDs.
These findings highlight both
the strengths and limitations of ChatGPT in cybersecurity applications. While
the model demonstrates potential for automating advisory generation, its
inability to reliably authenticate CVE-IDs or maintain consistency upon
re-evaluation underscores the risks associated with its deployment in critical
security tasks. Our study emphasizes the importance of using LLMs with caution
in cybersecurity workflows and suggests that further design improvements are
needed to make them reliable and applicable to security advisory generation.