Abstract
Malicious websites and phishing URLs pose an ever-increasing cybersecurity
risk, with phishing attacks growing by 40% in a single year. Traditional
detection approaches rely on machine learning classifiers or rule-based
scanners operating in the cloud, but these face significant challenges in
generalization, privacy, and evasion by sophisticated threats. In this paper,
we propose a novel client-side framework for comprehensive URL analysis that
leverages zero-shot inference by a local large language model (LLM) running
entirely in-browser. Our system uses a compact LLM (e.g., 3B or 8B parameters) via
WebLLM to perform reasoning over rich context collected from the target
webpage, including static code analysis (JavaScript abstract syntax trees,
structure, and code patterns), dynamic sandbox execution results (DOM changes,
API calls, and network requests), and visible content. We detail the
architecture and methodology of the system, which combines a real browser
sandbox (using iframes) resistant to common anti-analysis techniques, with an
LLM-based analyzer that assesses potential vulnerabilities and malicious
behaviors without any task-specific training (zero-shot). The LLM aggregates
evidence from multiple sources (code, execution trace, page content) to
classify the URL as benign or malicious and to provide an explanation of the
threats or security issues identified. We evaluate our approach on a diverse
set of benign and malicious URLs, demonstrating that even a compact client-side
model can achieve high detection accuracy and produce insightful explanations
comparable to those of cloud-based solutions, while operating privately on end-user
devices. The results show that client-side LLM inference is a feasible and
effective solution for web threat analysis, eliminating the need to send
potentially sensitive data to cloud services.
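As an illustration of the evidence-aggregation step described in the abstract, the following minimal sketch shows how static findings, sandbox findings, and visible content might be combined into a single zero-shot prompt, and how the model's reply might be parsed into a label and an explanation. Everything here is an assumption made for illustration: the PageEvidence and Verdict shapes, the buildPrompt and parseVerdict helpers, the prompt wording, and the JSON reply format are hypothetical and are not taken from the paper. The call into the in-browser model (for example through WebLLM) is deliberately left out.

```typescript
// Hypothetical evidence shape; field names are illustrative, not from the paper.
interface PageEvidence {
  url: string;
  staticFindings: string[];  // e.g. notes from JavaScript AST inspection
  dynamicFindings: string[]; // e.g. DOM changes, API calls, network requests seen in the sandbox
  visibleText: string;       // text rendered to the user
}

// Hypothetical result shape: a binary label plus a free-text explanation.
interface Verdict {
  label: "benign" | "malicious";
  explanation: string;
}

// Assemble a zero-shot prompt: instructions plus evidence, with no labeled examples.
function buildPrompt(e: PageEvidence, maxTextChars = 500): string {
  const list = (items: string[]): string =>
    items.length > 0 ? items.map((s) => `- ${s}`).join("\n") : "- (none)";
  return [
    "You are a web security analyst. Classify the URL as benign or malicious.",
    'Answer with JSON: {"label": "benign" | "malicious", "explanation": "..."}',
    `URL: ${e.url}`,
    "Static code findings:",
    list(e.staticFindings),
    "Dynamic sandbox findings:",
    list(e.dynamicFindings),
    "Visible content (truncated):",
    e.visibleText.slice(0, maxTextChars),
  ].join("\n");
}

// Parse the model reply defensively; a small model may wrap the JSON in extra text.
function parseVerdict(reply: string): Verdict | null {
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start < 0 || end <= start) return null;
  try {
    const obj = JSON.parse(reply.slice(start, end + 1));
    if (
      (obj.label === "benign" || obj.label === "malicious") &&
      typeof obj.explanation === "string"
    ) {
      return { label: obj.label, explanation: obj.explanation };
    }
  } catch {
    // Malformed JSON: fall through and report no verdict.
  }
  return null;
}
```

In this sketch, a reply that cannot be parsed yields null rather than a default label, so the caller decides how to handle an unusable answer instead of silently treating the page as benign.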
External Datasets
200 URLs consisting of 100 known malicious webpages and 100 benign webpages