The emergence of online services in our daily lives has been accompanied by a
range of malicious attempts to trick individuals into performing undesired
actions, often to the benefit of the adversary. The most common form of
these attempts is phishing, particularly through emails and websites.
To defend against such attacks, there is an urgent need for automated
mechanisms to identify this malicious content before it reaches users. Machine
learning techniques have gradually become the standard for such classification
problems. However, identifying common measurable features of phishing content
(e.g., in emails) is notoriously difficult. To address this problem, we conduct
a novel study of a phishing content classifier based on a recurrent neural
network (RNN), which identifies such features without human input. At this
stage, we scope our research to emails, but our approach can be extended to
apply to websites. Our results show that the proposed system outperforms
state-of-the-art tools. Furthermore, our classifier is efficient and considers
only the text and, in particular, the textual structure of the email.
Since these features are rarely considered in email classification, we argue
that our classifier can complement existing classifiers with high information
gain.