Abstract
Intrusion Detection and Prevention Systems (IDS/IPS) in large enterprises can
generate hundreds of thousands of alerts per hour, overwhelming analysts with
logs that demand rapidly evolving expertise to interpret. Conventional machine-learning
detectors reduce alert volume but still yield many false positives, while
standard Retrieval-Augmented Generation (RAG) pipelines often retrieve
irrelevant context and fail to justify predictions. We present CyberRAG, a
modular agent-based RAG framework that delivers real-time classification,
explanation, and structured reporting for cyber-attacks. A central LLM agent
orchestrates: (i) fine-tuned classifiers specialized by attack family; (ii)
tool adapters for enrichment and alerting; and (iii) an iterative
retrieval-and-reason loop that queries a domain-specific knowledge base until
evidence is relevant and self-consistent. Unlike traditional RAG, CyberRAG
adopts an agentic design that enables dynamic control flow and adaptive
reasoning. This architecture autonomously refines threat labels and
natural-language justifications, reducing false positives and enhancing
interpretability. It is also extensible: new attack types can be supported by
adding classifiers without retraining the core agent. CyberRAG was evaluated on
SQL Injection, Cross-Site Scripting (XSS), and Server-Side Template Injection
(SSTI), achieving over 94% accuracy per class and a final classification
accuracy of 94.92% through semantic orchestration.
Generated explanations reached 0.94 in BERTScore and 4.9/5 in GPT-4-based
expert evaluation, with robustness preserved against adversarial and unseen
payloads. These results show that agentic, specialist-oriented RAG can combine
high detection accuracy with trustworthy, SOC-ready prose, offering a flexible
path toward partially automated cyber-defense workflows.
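
The control flow described above (specialist classifiers, followed by a retrieval loop that repeats until the evidence is judged relevant) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the keyword heuristics stand in for the fine-tuned classifiers, the in-memory dictionary stands in for the domain-specific knowledge base, and every name here (`SPECIALISTS`, `KNOWLEDGE_BASE`, `classify`, `retrieve`, `evidence_is_relevant`, `analyze`, `Report`) is hypothetical. The LLM agent, tool adapters, and explanation generation are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# ASSUMPTION: keyword heuristics replace the paper's fine-tuned classifiers.
# Each specialist returns a confidence score in [0, 1] for its attack family.
def sqli_score(payload: str) -> float:
    p = payload.lower()
    return 0.9 if ("' or " in p or "union select" in p) else 0.1

def xss_score(payload: str) -> float:
    p = payload.lower()
    return 0.9 if ("<script" in p or "onerror=" in p) else 0.1

def ssti_score(payload: str) -> float:
    return 0.9 if ("{{" in payload and "}}" in payload) else 0.1

# New attack families are supported by registering another specialist here.
SPECIALISTS: Dict[str, Callable[[str], float]] = {
    "SQLi": sqli_score,
    "XSS": xss_score,
    "SSTI": ssti_score,
}

# ASSUMPTION: a toy in-memory store replaces the domain-specific knowledge base.
KNOWLEDGE_BASE: Dict[str, List[str]] = {
    "SQLi": ["Input handling overview.", "SQLi injects SQL syntax into a query."],
    "XSS": ["XSS injects script that runs in the victim's browser."],
    "SSTI": ["SSTI injects template expressions evaluated on the server."],
}

@dataclass
class Report:
    label: str
    confidence: float
    evidence: List[str]
    rounds: int

def classify(payload: str) -> Tuple[str, float]:
    """Run every specialist and keep the highest-scoring attack family."""
    scores = {name: clf(payload) for name, clf in SPECIALISTS.items()}
    label = max(scores, key=scores.get)
    return label, scores[label]

def retrieve(label: str, k: int) -> List[str]:
    """Return up to k passages stored for the given label."""
    return KNOWLEDGE_BASE.get(label, [])[:k]

def evidence_is_relevant(evidence: List[str], label: str) -> bool:
    """Toy relevance check: some passage must mention the label by name."""
    return any(label.lower() in passage.lower() for passage in evidence)

def analyze(payload: str, max_rounds: int = 3) -> Report:
    """Classify, then widen retrieval until evidence is relevant or rounds run out."""
    label, confidence = classify(payload)
    evidence: List[str] = []
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        evidence = retrieve(label, k=rounds)  # fetch one more passage each round
        if evidence_is_relevant(evidence, label):
            break
    return Report(label, confidence, evidence, rounds)

if __name__ == "__main__":
    print(analyze("1' OR '1'='1"))
```

In this sketch a benign payload still receives the first registered label, because no "no attack" class or confidence threshold is modeled; a practical system would need one.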