Abstract
Phishing kits are tools that skilled malicious developers provide to the
community of criminal phishers to facilitate the construction of malicious Web sites. As
these kits evolve in sophistication, providers of Web-based services need to
keep pace with their growing complexity. We present an original classification of
a corpus of over 2000 recent phishing kits according to their adopted evasion
and obfuscation functions. We carry out an initial deterministic analysis of
the source code of the kits to extract the most discriminant features and
information about their principal authors. We then complement this initial
classification with supervised machine learning models. Thanks to the
ground truth obtained in the first step, we can show whether, and which,
machine learning models are able to suitably classify even the kits adopting
novel evasion and obfuscation techniques that were unseen during the training
phase. We compare different algorithms and evaluate their robustness in the
realistic case in which only a small number of phishing kits are available for
training. This paper represents an initial but important step to support Web
service providers and analysts in improving early detection mechanisms and
intelligence operations for the phishing kits that might be installed on their
platforms.