Multimodal Large Language Models for Phishing Webpage Detection and Identification

Phishing attacks trick victims into disclosing sensitive information. To counter rapidly evolving attacks, we must explore machine learning and deep learning models leveraging large-scale data. We discuss models built on different kinds of data, along with their advantages and disadvantages, and present multiple deployment options to detect phishing attacks.

Phishing detection, webpage content analysis, cyber attacks

Network and Distributed System Security Symposium (NDSS)

Large-Scale Automatic Classification of Phishing Pages

C. Whittaker, B. Ryner, M. Nazif

Published: 2010

ACM Transactions on Information and System Security

CANTINA+: A Feature-Rich Machine Learning Framework for Detecting Phishing Web Sites

G. Xiang, J. Hong, C. P. Rose, L. Cranor

Published: 2011

Future Generation Computer Systems

A stacking model using URL and HTML features for phishing webpage detection

Y. Li, Z. Yang, X. Chen, H. Yuan, W. Liu

Published: 2019

NDSS MADWeb

Building robust phishing detection system: an empirical analysis

J. Lee, P. Ye, R. Liu, D. M. Divakaran, M. C. Chan

Published: 2020

ACM Intl. World Wide Web Conference (TheWebConf)

Phishing vs. legit: Comparative analysis of client-side resources of phishing and target brand websites

K. Lim, J. Park, D. Kim

Published: 2024

CCS

VisualPhishNet: Zero-day phishing website detection by visual similarity

Sahar Abdelnabi, Katharina Krombholz, Mario Fritz

Published: 2020

Proc. USENIX Security Symposium

Phishpedia: A Hybrid Deep Learning Based Approach to Visually Identify Phishing Webpages

Y. Lin, R. Liu, D. M. Divakaran, J. Y. Ng, Q. Z. Chan, Y. Lu, Y. Si, F. Zhang, J. S. Dong

Published: 2021

USENIX Security Symposium

Inferring Phishing Intention via Webpage Appearance and Dynamics: A Deep Vision Based Approach

R. Liu, Y. Lin, X. Yang, S. H. Ng, D. M. Divakaran, J. S. Dong

Published: 2022

Proc. PAM

LogoMotive: detecting logos on websites to identify online scams - a TLD case study

T. v. d. Hout, T. Wabeke, G. C. M. Moura, C. Hesselman

Published: 2022

arxiv

Cited by: 1

USENIX Security Symposium

KnowPhish: Large Language Models Meet Multimodal Knowledge Graphs for Enhancing Reference-Based Phishing Detection

Yuexin Li, Chengyu Huang, Shumin Deng, Mei Lin Lock, Tri Cao, Nay Oo, Hoon Wei Lim, Bryan Hooi

Published: 2024.3.5

Phishing attacks have inflicted substantial losses on individuals and businesses alike, necessitating the development of robust and efficient automated phishing detection approaches. Reference-based phishing detectors (RBPDs), which compare the logos on a target webpage to a known set of logos, have emerged as the state-of-the-art approach. However, a major limitation of existing RBPDs is that they rely on a manually constructed brand knowledge base, making it infeasible to scale to a large number of brands, which results in false negative errors due to the insufficient brand coverage of the knowledge base. To address this issue, we propose an automated knowledge collection pipeline, using which we collect a large-scale multimodal brand knowledge base, KnowPhish, containing 20k brands with rich information about each brand. KnowPhish can be used to boost the performance of existing RBPDs in a plug-and-play manner. A second limitation of existing RBPDs is that they solely rely on the image modality, ignoring useful textual information present in the webpage HTML. To utilize this textual information, we propose a Large Language Model (LLM)-based approach to extract brand information of webpages from text. Our resulting multimodal phishing detection approach, KnowPhish Detector (KPD), can detect phishing webpages with or without logos. We evaluate KnowPhish and KPD on a manually validated dataset, and a field study under Singapore's local context, showing substantial improvements in effectiveness and efficiency compared to state-of-the-art baselines.

Prompt injection, phishing detection, brand recognition problem

arXiv preprint

Evaluating the effectiveness and robustness of visual similarity-based phishing detection models

F. Ji, K. Lee, H. Koo, W. You, E. Choo, H. Kim, D. Kim

Published: 2024

IEEE transactions on dependable and secure computing

Detecting phishing web pages with visual similarity assessment based on earth mover's distance (EMD)

Fu, A. Y., Wenyin, L., Deng, X.

Published: 2006

32nd USENIX Security Symposium

Knowledge Expansion and Counterfactual Interaction for Reference-Based Phishing Detection

R. Liu, Y. Lin, Y. Zhang, P. H. Lee, J. S. Dong

Published: 2023

Proc. NIPS

Attention Is All You Need

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin

Published: 2017

Real attackers don’t compute gradients: Bridging the gap between adversarial ML research and practice

Giovanni Apruzzese, Hyrum S Anderson, Savino Dambra, David Freeman, Fabio Pierazzi, Kevin A Roundy

Published: 2022

Proc. ESORICS

Attacking logo-based phishing website detectors with adversarial perturbations

J. Lee, Z. Xin, M. N. P. See, K. Sabharwal, G. Apruzzese, D. M. Divakaran

Published: 2023

arxiv

Cited by: 1

From ML to LLM: Evaluating the Robustness of Phishing Webpage Detection Models against Adversarial Attacks

Aditya Kulkarni, Vivek Balachandran, Dinil Mon Divakaran, Tamal Das

Published: 2024.7.30

Phishing attacks attempt to deceive users in order to steal sensitive information, posing a significant cybersecurity threat. Advances in machine learning (ML) and deep learning (DL) have led to the development of numerous phishing webpage detection solutions, but these models remain vulnerable to adversarial attacks. Evaluating their robustness against adversarial phishing webpages is essential. Existing tools contain datasets of pre-designed phishing webpages for a limited number of brands, and lack diversity in phishing features. To address these challenges, we develop PhishOracle, a tool that generates adversarial phishing webpages by embedding diverse phishing features into legitimate webpages. We evaluate the robustness of three existing task-specific models (Stack model, VisualPhishNet, and Phishpedia) against PhishOracle-generated adversarial phishing webpages and observe a significant drop in their detection rates. In contrast, a multimodal large language model (MLLM)-based phishing detector demonstrates stronger robustness against these adversarial attacks but is still prone to evasion. Our findings highlight the vulnerability of phishing detection models to adversarial attacks, emphasizing the need for more robust detection approaches. Furthermore, we conduct a user study to evaluate whether PhishOracle-generated adversarial phishing webpages can deceive users. The results show that many of these phishing webpages evade not only existing detection models but also users. We also develop the PhishOracle web app, allowing users to input a legitimate URL, select relevant phishing features and generate a corresponding phishing webpage. All resources will be made publicly available on GitHub.

Prompt injection, phishing detection, dataset generation

arxiv

Cited by: 1

APWG Symposium on Electronic Crime Research (eCrime)

Multimodal Large Language Models for Phishing Webpage Detection and Identification

Jehyun Lee, Peiyuan Lim, Bryan Hooi, Dinil Mon Divakaran

Published: 2024.8.12

To address the challenging problem of detecting phishing webpages, researchers have developed numerous solutions, in particular those based on machine learning (ML) algorithms. Among these, brand-based phishing detection, which uses Computer Vision models to detect whether a given webpage is imitating a well-known brand, has received widespread attention. However, such models are costly and difficult to maintain, as they need to be retrained with labeled datasets that have to be regularly and continuously collected. Besides, they also need to maintain a good reference list of well-known websites and related metadata for effective performance. In this work, we take steps to study the efficacy of large language models (LLMs), in particular multimodal LLMs, in detecting phishing webpages. Given that LLMs are pretrained on a large corpus of data, we aim to make use of their understanding of different aspects of a webpage (logo, theme, favicon, etc.) to identify the brand of a given webpage and compare the identified brand with the domain name in the URL to detect a phishing attack. We propose a two-phase system employing LLMs in both phases: the first phase focuses on brand identification, while the second verifies the domain. We carry out comprehensive evaluations on a newly collected dataset. Our experiments show that the LLM-based system achieves a high detection rate at high precision; importantly, it also provides interpretable evidence for the decisions. Our system also performs significantly better than a state-of-the-art brand-based phishing detection system while demonstrating robustness against two known adversarial attacks.

Prompt injection, phishing detection, LLM performance evaluation
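The two-phase flow described in the abstract above (identify the brand, then check it against the domain in the URL) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `LLM` callable, the prompt wording, the `Verdict` type, and `stub_llm` are all hypothetical, and the sketch passes page text only, whereas the paper's system is multimodal.

```python
from dataclasses import dataclass
from typing import Callable, Optional
from urllib.parse import urlsplit

# Hypothetical interface: any function mapping a prompt string to the model's reply.
LLM = Callable[[str], str]


@dataclass
class Verdict:
    brand: Optional[str]  # brand the page appears to present, if any
    is_phishing: bool
    evidence: str         # short human-readable reason for the decision


def identify_brand(llm: LLM, page_text: str) -> Optional[str]:
    """Phase 1: ask the model which brand the page content presents."""
    prompt = (
        "TASK: identify_brand\n"
        "Reply with the brand name only, or UNKNOWN.\n"
        f"PAGE: {page_text}"
    )
    answer = llm(prompt).strip()
    return None if not answer or answer.upper() == "UNKNOWN" else answer


def verify_domain(llm: LLM, brand: str, host: str) -> bool:
    """Phase 2: ask the model whether the host plausibly belongs to the brand."""
    prompt = (
        "TASK: verify_domain\n"
        "Reply YES or NO.\n"
        f"BRAND: {brand}\n"
        f"HOST: {host}"
    )
    return llm(prompt).strip().upper().startswith("YES")


def detect(llm: LLM, url: str, page_text: str) -> Verdict:
    """Run both phases and return a decision with its supporting evidence."""
    host = (urlsplit(url).hostname or "").lower()
    brand = identify_brand(llm, page_text)
    if brand is None:
        return Verdict(None, False, "no brand identified on the page")
    if verify_domain(llm, brand, host):
        return Verdict(brand, False, f"{host} accepted as a {brand} domain")
    return Verdict(brand, True, f"page presents as {brand} but is served from {host}")


def stub_llm(prompt: str) -> str:
    """Toy stand-in for a real model so the sketch runs offline (assumption)."""
    fields = dict(line.split(": ", 1) for line in prompt.splitlines() if ": " in line)
    if fields["TASK"] == "identify_brand":
        return "PayPal" if "paypal" in fields["PAGE"].lower() else "UNKNOWN"
    host = fields["HOST"]
    return "YES" if host == "paypal.com" or host.endswith(".paypal.com") else "NO"
```

For example, `detect(stub_llm, "https://paypal.com.secure-login.example/signin", "Log in to your PayPal account")` flags the page, because the stub recognises the brand but rejects the host. Keeping the two phases as separate calls means each decision carries its own evidence string, in the spirit of the interpretable output the abstract mentions.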

SECURECOMM

A layout-similarity-based approach for detecting phishing pages

A. P. Rosiello, E. Kirda, F. Ferrandi

Published: 2007

arxiv

Cited by: 1

Multi-SpacePhish: Extending the Evasion-space of Adversarial Attacks against Phishing Website Detectors using Machine Learning

Ying Yuan, Giovanni Apruzzese, Mauro Conti

Published: 2022.10.25

Existing literature on adversarial Machine Learning (ML) focuses either on showing attacks that break every ML model, or defenses that withstand most attacks. Unfortunately, little consideration is given to the actual feasibility of the attack or the defense. Moreover, adversarial samples are often crafted in the "feature-space", making the corresponding evaluations of questionable value. Simply put, the current situation does not allow estimating the actual threat posed by adversarial attacks, leading to a lack of secure ML systems. We aim to clarify such confusion in this paper. By considering the application of ML for Phishing Website Detection (PWD), we formalize the "evasion-space" in which an adversarial perturbation can be introduced to fool an ML-PWD, demonstrating that even perturbations in the "feature-space" are useful. Then, we propose a realistic threat model describing evasion attacks against ML-PWD that are cheap to stage, and hence intrinsically more attractive for real phishers. After that, we perform the first statistically validated assessment of state-of-the-art ML-PWD against 12 evasion attacks. Our evaluation shows (i) the true efficacy of evasion attempts that are more likely to occur; and (ii) the impact of perturbations crafted in different evasion-spaces. Our realistic evasion attempts induce a statistically significant degradation (3-10% at p<0.05), and their cheap cost makes them a subtle threat. Notably, however, some ML-PWD are immune to our most realistic attacks (p=0.22). Finally, as an additional contribution of this journal publication, we are the first to consider the intriguing case wherein an attacker introduces perturbations in multiple evasion-spaces at the same time. These new results show that simultaneously applying perturbations in the problem- and feature-space can cause a drop in the detection rate from 0.95 to 0.

Malicious website detection, poisoning attacks, scenario analysis

Inf. Sci.

A new hybrid ensemble feature selection framework for machine learning-based phishing detection system

Kang Leng Chiew, Choon Lin Tan, KokSheik Wong, Kelvin S.C. Yong, Wei King Tiong

Published: 2019

2021 IEEE Conference on Information and Communication Systems (ICICS)

Identi: Identifying and detecting online phishing attacks using deep learning and NLP techniques

K. T. Saleh, Z. A.-N. Al-Makhadmeh, M. Qaddoura

Published: 2021

2011 IEEE fifth international conference on semantic computing

PhishZoo: Detecting phishing websites by looking at them

Afroz, S., Greenstadt, R.

Published: 2011

2017 IEEE 17th International Conference on Communication Technology (ICCT)

Phishing web page detection based on visual similarity features using CNN

S. Liu, F. Chen, S. Jiang, K. Lu, Y. Liu, Q. Li

Published: 2017

Computers & Security

LogoSENSE: A Companion HOG based Logo Detection Scheme for Phishing Web Page and E-mail Brand Recognition

A. S. Bozkir, M. Aydos

Published: 2020

Computers & Security

Phishaod: An automated detection framework for phishing URLs based on deep learning

M. Narwaria, S. Roy, S. Das

Published: 2020

Expert systems with applications

Hybrid phishing detection using joint visual and textual identity

C. C. L. Tan, K. L. Chiew, K. S. Yong, Y. Sebastian, J. C. M. Than, W. K. Tiong

Published: 2023

arxiv

Cited by: 1

Computing Research Repository (CoRR)

LLMs for Cyber Security: New Opportunities

Dinil Mon Divakaran, Sai Teja Peddinti

Published: 2024.4.17

Large language models (LLMs) are a class of powerful and versatile models that are beneficial to many industries. With the emergence of LLMs, we take a fresh look at cyber security, specifically exploring and summarizing the potential of LLMs in addressing challenging problems in the security and safety domains.

Cybersecurity, LLM security

AutoCodeRover: Autonomous Program Improvement

Y. Zhang, H. Ruan, Z. Fan, A. Roychoudhury

Published: 2024

arxiv

Cited by: 2

You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content

Xinlei He, Savvas Zannettou, Yun Shen, Yang Zhang

Published: 2023.8.10

The spread of toxic content online is an important problem that has adverse effects on user experience online and in our society at large. Motivated by the importance and impact of the problem, research focuses on developing solutions to detect toxic content, usually leveraging machine learning (ML) models trained on human-annotated datasets. While these efforts are important, these models usually do not generalize well and they cannot cope with new trends (e.g., the emergence of new toxic terms). Currently, we are witnessing a shift in the approach to tackling societal issues online, particularly leveraging large language models (LLMs) like GPT-3 or T5 that are trained on vast corpora and have strong generalizability. In this work, we investigate how we can use LLMs and prompt learning to tackle the problem of toxic content, particularly focusing on three tasks: 1) Toxicity Classification, 2) Toxic Span Detection, and 3) Detoxification. We perform an extensive evaluation over five model architectures and eight datasets demonstrating that LLMs with prompt learning can achieve similar or even better performance compared to models trained on these specific tasks. We find that prompt learning achieves around 10% improvement in the toxicity classification task compared to the baselines, while for the toxic span detection task we find better performance than the best baseline (0.643 vs. 0.640 in terms of F1-score). Finally, for the detoxification task, we find that prompt learning can successfully reduce the average toxicity score (from 0.775 to 0.213) while preserving semantic meaning.

Output toxicity scoring, text detoxification, prompt leaking

Proc. IEEE EuroS&P

D-Fence: A Flexible, Efficient, and Comprehensive Phishing Email Detection System

J. Lee, F. Tang, P. Ye, F. Abbasi, P. Hay, D. M. Divakaran

Published: 2021

arxiv

Cited by: 1

Security and Privacy in Communication Networks (SecureComm)

ChatSpamDetector: Leveraging Large Language Models for Effective Phishing Email Detection

Takashi Koide, Naoki Fukushi, Hiroki Nakano, Daiki Chiba

Published: 2024.2.28

The proliferation of phishing sites and emails poses significant challenges to existing cybersecurity efforts. Despite advances in malicious email filters and email security protocols, problems with oversight and false positives persist. Users often struggle to understand why emails are flagged as potentially fraudulent, risking the possibility of missing important communications or mistakenly trusting deceptive phishing emails. This study introduces ChatSpamDetector, a system that uses large language models (LLMs) to detect phishing emails. By converting email data into a prompt suitable for LLM analysis, the system provides a highly accurate determination of whether an email is phishing or not. Importantly, it offers detailed reasoning for its phishing determinations, assisting users in making informed decisions about how to handle suspicious emails. We conducted an evaluation using a comprehensive phishing email dataset and compared our system to several LLMs and baseline systems. We confirmed that our system using GPT-4 has superior detection capabilities with an accuracy of 99.70%. Advanced contextual interpretation by LLMs enables the identification of various phishing tactics and impersonations, making them a potentially powerful tool in the fight against email-based phishing threats.

Phishing detection, prompt injection, email security

Proc. NDSS

Tranco: A research-oriented top sites ranking hardened against manipulation

V. L. Pochat, T. Van Goethem, S. Tajalizadehkhoob, M. Korczynski, W. Joosen

Published: 2019

Proc. IEEE S&P

CrawlPhish: Large-scale analysis of client-side cloaking techniques in phishing

P. Zhang, A. Oest, H. Cho, Z. Sun, R. Johnson, B. Wardman, S. Sarker, A. Kapravelos, T. Bao, R. Wang, Y. Shoshitaishvili, A. Doupe, G.-J. Ahn

Published: 2021

Proc. ACM CCS

I’m Spartacus, no, I’m Spartacus: Proactively protecting users from phishing by intentionally triggering cloaking behavior

P. Zhang, Z. Sun, S. Kyung, H. W. Behrens, Z. L. Basque, H. Cho, A. Oest, R. Wang, T. Bao, Y. Shoshitaishvili, G.-J. Ahn, A. Doupe

Published: 2022

IEEE Symposium on Security and Privacy (SP)

CFrame: Characterizing and measuring in-the-wild CAPTCHA attacks

H. Dai Nguyen, K. Subramani, B. Acharya, R. Perdisci, P. Vadrevu

Published: 2024

Virustotal

arxiv

Cited by: 1

"Are Adversarial Phishing Webpages a Threat in Reality?" Understanding the Users' Perception of Adversarial Webpages

Ying Yuan, Qingying Hao, Giovanni Apruzzese, Mauro Conti, Gang Wang

Published: 2024.4.4

Machine learning based phishing website detectors (ML-PWD) are a critical part of today's anti-phishing solutions in operation. Unfortunately, ML-PWD are prone to adversarial evasions, evidenced by both academic studies and analyses of real-world adversarial phishing webpages. However, existing works mostly focused on assessing adversarial phishing webpages against ML-PWD, while neglecting a crucial aspect: investigating whether they can deceive the actual target of phishing -- the end users. In this paper, we fill this gap by conducting two user studies (n=470) to examine how human users perceive adversarial phishing webpages, spanning both synthetically crafted ones (which we create by evading a state-of-the-art ML-PWD) as well as real adversarial webpages (taken from the wild Web) that bypassed a production-grade ML-PWD. Our findings confirm that adversarial phishing is a threat to both users and ML-PWD, since most adversarial phishing webpages have comparable effectiveness on users w.r.t. unperturbed ones. However, not all adversarial perturbations are equally effective. For example, those with added typos are significantly more noticeable to users, who tend to overlook perturbations of higher visual magnitude (such as replacing the background). We also show that users' self-reported frequency of visiting a brand's website has a statistically negative correlation with their phishing detection accuracy, which is likely caused by overconfidence. We release our resources.

Phishing detection, phishing attacks, phishing attack detection rate

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N.

Published: 2021

Proc. IEEE CVPR

Generative adversarial perturbations

O. Poursaeed, I. Katsman, B. Gao, S. Belongie

Published: 2018

CoRR

(Ab)using images and sounds for indirect instruction injection in multi-modal LLMs

E. Bagdasaryan, T. Hsieh, B. Nassi, V. Shmatikov

Published: 2023

arXiv

WIPI: A New Web Threat for LLM-Driven Web Agents

F. Wu, S. Wu, Y. Cao, C. Xiao

Published: 2024