These labels were automatically added by AI and may be inaccurate. For details, see About Literature Database.
Abstract
Scammers are increasingly harnessing generative AI(GenAI) technologies to
produce convincing phishing content at scale, amplifying financial fraud and
undermining public trust. While conventional defenses, such as detection
algorithms, user training, and reactive takedown efforts remain important, they
often fall short in dismantling the infrastructure scammers depend on,
including mule bank accounts and cryptocurrency wallets. To bridge this gap, a
proactive and emerging strategy involves using conversational honeypots to
engage scammers and extract actionable threat intelligence. This paper presents
the first large-scale, real-world evaluation of a scambaiting system powered by
large language models (LLMs). Over a five-month deployment, the system
initiated over 2,600 engagements with actual scammers, resulting in a dataset
of more than 18,700 messages. It achieved an Information Disclosure Rate (IDR)
of approximately 32%, successfully extracting sensitive financial information
such as mule accounts. Additionally, the system maintained a Human Acceptance
Rate (HAR) of around 70%, indicating strong alignment between LLM-generated
responses and human operator preferences. Alongside these successes, our
analysis reveals key operational challenges. In particular, the system
struggled with engagement takeoff: only 48.7% of scammers responded to the
initial seed message sent by defenders. These findings highlight the need for
further refinement and provide actionable insights for advancing the design of
automated scambaiting systems.
External Datasets
18,797 messages across 2,638 attacker-defender engagements