APOLLO: A GPT-based tool to detect phishing emails and generate explanations that warn users

arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2410.07997

PDF

https://arxiv.org/pdf/2410.07997

Paper Information

Author: Giuseppe Desolda; Francesco Greco; Luca Viganò
Published: 10-10-2024
Affiliation: University of Bari “A. Moro”
Country: Italy
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Phishing Detection, Prompt Injection, User Education

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Phishing is one of the most prolific cybercriminal activities, with attacks becoming increasingly sophisticated. It is, therefore, imperative to explore novel technologies to improve user protection across both technical and human dimensions. Large Language Models (LLMs) offer significant promise for text processing in various domains, but their use for defense against phishing attacks remains scarcely explored. In this paper, we present APOLLO, a tool based on OpenAI's GPT-4o that detects phishing emails and generates messages explaining to users why a specific email is dangerous, thus improving their decision-making capabilities. We have evaluated the performance of APOLLO in classifying phishing emails; the results show that LLMs have exemplary capabilities in classifying phishing emails (97 percent accuracy in the case of GPT-4o) and that this performance can be further improved by integrating data from third-party services, resulting in a near-perfect classification rate (99 percent accuracy). To assess the perception of the explanations generated by this tool, we also conducted a study with 20 participants, comparing four different explanations presented as phishing warnings. We compared the LLM-generated explanations to four baselines: a manually crafted warning and warnings from the Chrome, Firefox, and Edge browsers. The results show that the LLM-generated explanations were not only perceived as high quality, but can also be more understandable, interesting, and trustworthy than the baselines. These findings suggest that using LLMs as a defense against phishing is a very promising approach, with APOLLO representing a proof of concept in this research direction.