PhishLang: A Lightweight, Client-Side Phishing Detection Framework using MobileBERT for Real-Time, Explainable Threat Mitigation
Abstract
In this paper, we introduce PhishLang, an open-source, lightweight language model designed specifically for phishing website detection through contextual analysis of the website. Traditional heuristic and machine learning models rely on static features and struggle to adapt to new threats, while deep learning models are computationally intensive. In contrast, our model leverages MobileBERT, a fast and memory-efficient variant of the BERT architecture, to learn granular features characteristic of phishing attacks. PhishLang operates with minimal data preprocessing and offers performance comparable to leading deep learning anti-phishing tools, while being significantly faster and less resource-intensive. Over a 3.5-month testing period, PhishLang successfully identified 25,796 phishing URLs, many of which were undetected by popular anti-phishing blocklists, demonstrating its potential to enhance current detection measures. Capitalizing on PhishLang's resource efficiency, we release the first open-source, fully client-side Chromium browser extension that performs inference locally without needing to consult an online blocklist and can run on low-end systems with no impact on inference times. Our implementation not only outperforms prevalent (server-side) phishing tools, but is also significantly more effective than the limited commercial client-side measures available. Furthermore, we study how PhishLang can be integrated with GPT-3.5 Turbo to create explainable blocklisting, which, upon detecting a phishing website, provides users with detailed contextual information about the features that led to the website being marked as phishing.
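The control flow described in the abstract (score a page locally, then request an explanation only for flagged pages) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the names `Verdict`, `classify_locally`, `build_explanation_prompt`, and `toy_score`, the 0.5 threshold, and the prompt wording are all hypothetical, and `toy_score` is a keyword-counting stand-in for the MobileBERT model, used only so the sketch runs without any model weights.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    url: str
    score: float        # estimated probability that the page is phishing
    is_phishing: bool


def classify_locally(url: str, page_source: str,
                     score_fn: Callable[[str], float],
                     threshold: float = 0.5) -> Verdict:
    """Score a page on the client; no blocklist or server is consulted."""
    score = score_fn(page_source)
    return Verdict(url=url, score=score, is_phishing=score >= threshold)


def build_explanation_prompt(verdict: Verdict, page_source: str,
                             max_chars: int = 2000) -> Optional[str]:
    """Return a prompt asking an LLM to explain a detection, or None if benign."""
    if not verdict.is_phishing:
        return None
    excerpt = page_source[:max_chars]  # assumed cap to keep the prompt short
    return (
        f"The page at {verdict.url} was flagged as phishing "
        f"(score {verdict.score:.2f}). Explain in plain language which "
        f"features of the following source suggest phishing:\n\n{excerpt}"
    )


def toy_score(page_source: str) -> float:
    """Stand-in scorer (NOT the paper's model): fraction of cue strings present."""
    cues = ('type="password"', "verify your account", "login")
    text = page_source.lower()
    return sum(cue in text for cue in cues) / len(cues)
```

In an actual deployment, `score_fn` would presumably wrap the fine-tuned language model, and the returned prompt would be sent to an external LLM; both steps are left out here because the abstract does not specify them.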