PhishIntentionLLM: Uncovering Phishing Website Intentions through Multi-Agent Retrieval-Augmented Generation

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2507.15419

PDF

https://arxiv.org/pdf/2507.15419

Paper Information

Authors: Wenhao Li, Selvakumar Manickam, Yung-wey Chong, Shankar Karuppayah
Published: July 21, 2025
Affiliation: Cybersecurity Research Centre, Universiti Sains Malaysia
Country: Malaysia

Labels Estimated by AI

Poisoning attack on RAG, Phishing attack intention, Prompt leaking

These labels were automatically added by AI and may be inaccurate.

Abstract

Phishing websites remain a major cybersecurity threat, yet existing methods primarily focus on detection, while the recognition of underlying malicious intentions remains largely unexplored. To address this gap, we propose PhishIntentionLLM, a multi-agent retrieval-augmented generation (RAG) framework that uncovers phishing intentions from website screenshots. Leveraging the visual-language capabilities of large language models (LLMs), our framework identifies four key phishing objectives: Credential Theft, Financial Fraud, Malware Distribution, and Personal Information Harvesting. We construct and release the first phishing intention ground truth dataset (~2K samples) and evaluate the framework using four commercial LLMs. Experimental results show that PhishIntentionLLM achieves a micro-precision of 0.7895 with GPT-4o and significantly outperforms the single-agent baseline with a ~95% improvement in micro-precision. Compared to the previous work, it achieves 0.8545 precision for credential theft, marking a ~4% improvement. Additionally, we generate a larger dataset of ~9K samples for large-scale phishing intention profiling across sectors. This work provides a scalable and interpretable solution for intention-aware phishing analysis.
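
The abstract reports results as micro-precision over the four intention classes. As a minimal sketch of how that metric is commonly computed, the Python below pools true positives and false positives across all samples and classes before dividing. It assumes a multi-label setting in which one page may carry several intentions; the paper's exact evaluation protocol is not given in this entry, and the function name and toy labels are hypothetical, not taken from the released dataset.

```python
from typing import Sequence, Set

# The four phishing objectives named in the abstract.
INTENTIONS = (
    "Credential Theft",
    "Financial Fraud",
    "Malware Distribution",
    "Personal Information Harvesting",
)


def micro_precision(y_true: Sequence[Set[str]], y_pred: Sequence[Set[str]]) -> float:
    """Micro-averaged precision: pooled TP / (pooled TP + pooled FP)."""
    tp = 0
    fp = 0
    for truth, pred in zip(y_true, y_pred):
        tp += len(pred & truth)  # predicted labels that are correct
        fp += len(pred - truth)  # predicted labels that are wrong
    total = tp + fp
    return tp / total if total else 0.0


# Toy labels, invented for illustration only (assumption, not real data).
truth = [
    {"Credential Theft"},
    {"Financial Fraud", "Personal Information Harvesting"},
    {"Malware Distribution"},
]
pred = [
    {"Credential Theft", "Personal Information Harvesting"},
    {"Financial Fraud"},
    {"Malware Distribution"},
]

score = micro_precision(truth, pred)  # 3 correct out of 4 predicted labels
```

Because the counts are pooled before dividing, classes with more predictions weigh more heavily in the score than they would under a per-class (macro) average.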

External Datasets

phishing intention ground truth dataset

larger-scale phishing intention dataset