Abstract
Phishing emails are a critical component of the cybercrime kill chain due to
their wide reach and low cost. Their ever-evolving nature renders traditional
rule-based and feature-engineered detectors ineffective in the ongoing arms
race between attackers and defenders. The rise of large language models (LLMs)
further exacerbates the threat, enabling attackers to craft highly convincing
phishing emails at minimal cost.
This work demonstrates that LLMs can generate psychologically persuasive
phishing emails tailored to victim profiles, successfully bypassing nearly all
commercial and academic detectors. To defend against such threats, we propose
PiMRef, the first reference-based phishing email detector that leverages
knowledge-based invariants. Our core insight is that persuasive phishing emails
often contain disprovable identity claims that contradict real-world facts.
PiMRef reframes phishing detection as an identity fact-checking task. Given an
email, PiMRef (i) extracts the sender's claimed identity, (ii) verifies the
legitimacy of the sender's domain against a predefined knowledge base, and
(iii) detects call-to-action prompts that push user engagement. Contradictory
claims are flagged as phishing indicators and serve as human-understandable
explanations.
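
To make the three steps concrete, the following is a minimal sketch of an
identity fact-checking pipeline, not the PiMRef implementation itself. The
knowledge base contents, the keyword-based identity extraction, the
call-to-action pattern, and all function names are hypothetical stand-ins
chosen for illustration. The rule that an email is flagged only when a
contradictory identity claim co-occurs with a call to action is likewise an
assumption of this sketch.

```python
import re
from dataclasses import dataclass
from email.utils import parseaddr

# Hypothetical knowledge base: claimed identity -> domains it legitimately uses.
KNOWLEDGE_BASE = {
    "paypal": {"paypal.com"},
    "microsoft": {"microsoft.com", "office.com"},
}

# Hypothetical call-to-action cues; a real detector would be far richer.
CTA_PATTERN = re.compile(
    r"\b(click|verify|log ?in|confirm|update|download)\b", re.IGNORECASE
)


@dataclass
class Verdict:
    is_phishing: bool
    explanation: str


def extract_claimed_identity(text):
    """Step (i): return the first known identity named in the text, if any.

    Simple keyword matching stands in for a real identity extractor.
    """
    lowered = text.lower()
    for brand in KNOWLEDGE_BASE:
        if re.search(rf"\b{re.escape(brand)}\b", lowered):
            return brand
    return None


def sender_domain(from_header):
    """Return the lower-cased domain part of a From header."""
    _, address = parseaddr(from_header)
    return address.rpartition("@")[2].lower()


def domain_is_legitimate(domain, legit_domains):
    """Step (ii): accept an exact match or a subdomain of a known domain."""
    return any(domain == d or domain.endswith("." + d) for d in legit_domains)


def check_email(from_header, body):
    """Run steps (i) to (iii) and return a verdict with an explanation."""
    brand = extract_claimed_identity(from_header + " " + body)
    if brand is None:
        return Verdict(False, "no known identity is claimed")
    domain = sender_domain(from_header)
    if domain_is_legitimate(domain, KNOWLEDGE_BASE[brand]):
        return Verdict(False, f"sender domain {domain} is consistent with {brand}")
    # Step (iii): look for a prompt that pushes the reader to act.
    if not CTA_PATTERN.search(body):
        return Verdict(False, f"claim of {brand} is unverified, but no call to action")
    return Verdict(
        True,
        f"claims to be {brand} but was sent from {domain}, and asks the reader to act",
    )


if __name__ == "__main__":
    verdict = check_email(
        "PayPal Support <security@paypa1-alerts.com>",
        "Your PayPal account is limited. Click here to verify your details.",
    )
    print(verdict.is_phishing, "-", verdict.explanation)
```

The explanation string mirrors the idea that a contradictory claim doubles as a
human-understandable reason for the verdict.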
Compared to existing methods such as D-Fence, HelpHed, and ChatSpamDetector,
PiMRef boosts precision by 8.8% with no loss in recall on standard benchmarks
like Nazario and PhishPot. In a real-world evaluation of 10,183 emails across
five university accounts over three years, PiMRef achieved 92.1% precision,
87.9% recall, and a median runtime of 0.05 s, outperforming the state of the art
in both effectiveness and efficiency.