Abstract
The short message service (SMS) was introduced to mobile phone users a
generation ago. It forms the world's oldest large-scale network, with billions
of users, and therefore attracts a lot of fraud. Due to the convergence of
mobile networks with the internet, SMS-based scams can potentially compromise
the security of internet services as well. In this study, we present a new SMS scam
dataset consisting of 153,551 SMSes. This dataset, which we will release publicly
for research purposes, represents the largest publicly available SMS scam
dataset. We evaluate and compare the performance achieved by several
established machine learning methods on the new dataset, ranging from shallow
machine learning approaches to deep neural networks to syntactic and semantic
feature models. We then study the existing models from an adversarial viewpoint
by assessing their robustness against different levels of adversarial
manipulation. This perspective consolidates the current state of the art in SMS
spam filtering and highlights both the limitations of the existing approaches
and the opportunities to improve them.