Abstract
Voice phishing (vishing) remains a persistent threat in cybersecurity,
exploiting human trust through persuasive speech. While machine learning
(ML)-based classifiers have shown promise in detecting malicious call
transcripts, they remain vulnerable to adversarial manipulations that preserve
semantic content. In this study, we explore a novel attack vector where large
language models (LLMs) are leveraged to generate adversarial vishing
transcripts that evade detection while maintaining deceptive intent. We
construct a systematic attack pipeline that employs prompt engineering and
semantic obfuscation to transform real-world vishing scripts using four
commercial LLMs. The generated transcripts are evaluated against multiple ML
classifiers trained on a real-world Korean vishing dataset (KorCCViD), with
statistical significance testing. Our experiments reveal that LLM-generated transcripts are
both practically and statistically effective against ML-based classifiers. In
particular, transcripts crafted by GPT-4o significantly reduce classifier
accuracy (by up to 30.96%) while maintaining high semantic similarity, as
measured by BERTScore. Moreover, these attacks are time-efficient and
cost-effective, with average generation times under 9 seconds and negligible
financial cost per query. The results underscore the pressing need for more
resilient vishing detection frameworks and highlight the imperative for LLM
providers to enforce stronger safeguards against prompt misuse in adversarial
social engineering contexts.