Abstract
The Internet is used by billions of users every day because it offers fast
and free communication tools and platforms. Nevertheless, as usage has grown,
huge amounts of spam are generated every second, which
wastes Internet resources and, more importantly, users' time. This study
investigates the use of machine learning models to classify URLs as spam or
nonspam. Because each sample consists of a single raw field, the URL string,
we first extract features from it,
and then we compare the performance of several models, including k-nearest
neighbors, bagging, random forest, logistic regression, and others.
Experimental results demonstrate that bagging outperformed other models and
achieved the highest accuracy of 98.64%. In addition, bagging outperformed the
current state-of-the-art approaches, which underscores its effectiveness in
addressing spam-related challenges on the Internet. This suggests that bagging
is a promising approach for URL spam classification.
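
To illustrate the feature-extraction step, the following is a minimal sketch of how a single URL string might be turned into numeric features before being passed to a classifier. The abstract does not list the features actually used, so the function name `extract_url_features` and every feature below (length, dot count, digit count, and so on) are assumptions chosen for illustration, not the study's feature set.

```python
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Map one raw URL string to a dict of simple numeric features.

    All features here are hypothetical examples; the study's actual
    feature set is not given in the abstract.
    """
    parsed = urlparse(url)
    # Non-empty path segments, e.g. "/a/b" -> ["a", "b"]
    path_segments = [seg for seg in parsed.path.split("/") if seg]
    return {
        "length": len(url),
        "num_dots": url.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_hyphens": url.count("-"),
        "num_query_params": len(parsed.query.split("&")) if parsed.query else 0,
        "uses_https": int(parsed.scheme == "https"),
        "host_length": len(parsed.netloc),
        "path_depth": len(path_segments),
    }


features = extract_url_features("https://example.com/a/b?x=1&y=2")
```

A feature dictionary of this kind, computed for every URL, could then be assembled into a table and used to train and compare the models named above, such as bagging or random forest.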