Abstract
Spam messages continue to present significant challenges to digital users,
cluttering inboxes and posing security risks. Traditional spam detection
methods, including rule-based, collaborative, and machine learning approaches,
struggle to keep up with the rapidly evolving tactics employed by spammers.
This project studies new spam detection systems that leverage Large Language
Models (LLMs) fine-tuned on spam datasets. More importantly, we examine how
LLM-based spam detection systems perform under adversarial attacks that
purposefully modify spam emails and under data poisoning attacks that exploit
the differences between the training data and the messages seen at detection
time, attacks to which traditional machine learning models have been shown to
be vulnerable. Our experiments employ two LLMs, GPT-2 and BERT, and three spam
datasets, Enron, LingSpam, and SMSspamCollection, for extensive training and
testing. The results show that, while LLMs can function as effective spam
filters, they are susceptible to both adversarial and data poisoning attacks.
This research provides useful insights for future applications of LLMs in
information security.