These labels were automatically added by AI and may be inaccurate. For details, see About Literature Database.
Abstract
Malicious URL, a.k.a. malicious website, is a common and serious threat to
cybersecurity. Malicious URLs host unsolicited content (spam, phishing,
drive-by exploits, etc.) and lure unsuspecting users to become victims of scams
(monetary loss, theft of private information, and malware installation), and
cause losses of billions of dollars every year. It is imperative to detect and
act on such threats in a timely manner. Traditionally, this detection is done
mostly through the usage of blacklists. However, blacklists cannot be
exhaustive, and lack the ability to detect newly generated malicious URLs. To
improve the generality of malicious URL detectors, machine learning techniques
have been explored with increasing attention in recent years. This article aims
to provide a comprehensive survey and a structural understanding of Malicious
URL Detection techniques using machine learning. We present the formal
formulation of Malicious URL Detection as a machine learning task, and
categorize and review the contributions of literature studies that addresses
different dimensions of this problem (feature representation, algorithm design,
etc.). Further, this article provides a timely and comprehensive survey for a
range of different audiences, not only for machine learning researchers and
engineers in academia, but also for professionals and practitioners in
cybersecurity industry, to help them understand the state of the art and
facilitate their own research and practical applications. We also discuss
practical issues in system design, open research challenges, and point out some
important directions for future research.