Client-Side Zero-Shot LLM Inference for Comprehensive In-Browser URL Analysis

Source

https://arxiv.org/abs/2506.03656

PDF

https://arxiv.org/pdf/2506.03656

Paper Information

Author: Avihay Cohen
Published: June 4, 2025
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Prompt Injection, Dynamic Analysis, Alignment

These labels were automatically added by AI and may be inaccurate.

Abstract

Malicious websites and phishing URLs pose an ever-increasing cybersecurity risk, with phishing attacks growing by 40% in a single year. Traditional detection approaches rely on machine learning classifiers or rule-based scanners operating in the cloud, but these face significant challenges in generalization, privacy, and evasion by sophisticated threats. In this paper, we propose a novel client-side framework for comprehensive URL analysis that leverages zero-shot inference by a local large language model (LLM) running entirely in-browser. Our system uses a compact LLM (e.g., 3B/8B parameters) via WebLLM to perform reasoning over rich context collected from the target webpage, including static code analysis (JavaScript abstract syntax trees, structure, and code patterns), dynamic sandbox execution results (DOM changes, API calls, and network requests), and visible content. We detail the architecture and methodology of the system, which combines a real browser sandbox (using iframes) resistant to common anti-analysis techniques, with an LLM-based analyzer that assesses potential vulnerabilities and malicious behaviors without any task-specific training (zero-shot). The LLM aggregates evidence from multiple sources (code, execution trace, page content) to classify the URL as benign or malicious and to provide an explanation of the threats or security issues identified. We evaluate our approach on a diverse set of benign and malicious URLs, demonstrating that even a compact client-side model can achieve high detection accuracy and insightful explanations comparable to cloud-based solutions, while operating privately on end-user devices. The results show that client-side LLM inference is a feasible and effective solution to web threat analysis, eliminating the need to send potentially sensitive data to cloud services.
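
The evidence-aggregation step described in the abstract can be illustrated with a minimal TypeScript sketch. It is not the paper's implementation. The PageEvidence fields, the prompt wording, the 'VERDICT:' reply convention, and the Complete callback are all assumptions made for illustration. The model call is kept abstract so that any local in-browser LLM could be plugged in.

```typescript
// Hypothetical evidence record; field names are illustrative, not taken from the paper.
interface PageEvidence {
  url: string;
  staticFindings: string[];  // e.g. notes from JavaScript AST inspection
  dynamicFindings: string[]; // e.g. DOM changes, API calls, network requests seen in the sandbox
  visibleText: string;       // text a user would see on the page
}

interface Verdict {
  label: "benign" | "malicious" | "unknown";
  explanation: string;
}

// Any text-in/text-out model call (assumption: a local LLM is wrapped behind this).
type Complete = (prompt: string) => Promise<string>;

// Assemble a zero-shot prompt: instructions plus evidence, no labeled examples.
function buildPrompt(e: PageEvidence, maxTextChars = 500): string {
  const list = (items: string[]): string =>
    items.length > 0 ? items.map((s) => `- ${s}`).join("\n") : "- (none)";
  return [
    "You are a web security analyst. Classify the URL as benign or malicious.",
    "Answer with a first line 'VERDICT: benign' or 'VERDICT: malicious', then a short explanation.",
    `URL: ${e.url}`,
    "Static code analysis:",
    list(e.staticFindings),
    "Dynamic sandbox observations:",
    list(e.dynamicFindings),
    "Visible content (truncated):",
    e.visibleText.slice(0, maxTextChars),
  ].join("\n");
}

// Parse the reply; anything that does not follow the convention is "unknown".
function parseVerdict(reply: string): Verdict {
  const match = /VERDICT:\s*(benign|malicious)/i.exec(reply);
  if (!match) {
    return { label: "unknown", explanation: reply.trim() };
  }
  const label = match[1].toLowerCase() as "benign" | "malicious";
  const explanation = reply.slice(match.index + match[0].length).trim();
  return { label, explanation };
}

async function analyzeUrl(e: PageEvidence, complete: Complete): Promise<Verdict> {
  return parseVerdict(await complete(buildPrompt(e)));
}
```

Treating an unparseable reply as "unknown" rather than defaulting to benign is a design choice of this sketch: compact models do not always follow output conventions, and a silent benign default would hide those cases.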

External Datasets

200 URLs consisting of 100 known malicious webpages and 100 benign webpages
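
With a labeled set like this one, detection quality is usually summarized by accuracy, precision, and recall. The sketch below shows one way to compute them from ground-truth and predicted labels. The function and type names are hypothetical, and no figures from the paper are reproduced.

```typescript
type Label = "benign" | "malicious";

interface Metrics {
  accuracy: number;
  precision: number; // of URLs flagged malicious, the fraction that truly are
  recall: number;    // of truly malicious URLs, the fraction that were flagged
}

function score(truth: Label[], predicted: Label[]): Metrics {
  if (truth.length !== predicted.length) {
    throw new Error("truth and predicted must have the same length");
  }
  let tp = 0, fp = 0, tn = 0, fn = 0;
  truth.forEach((t, i) => {
    const p = predicted[i];
    if (t === "malicious" && p === "malicious") tp++;
    else if (t === "benign" && p === "malicious") fp++;
    else if (t === "benign" && p === "benign") tn++;
    else fn++; // malicious URL predicted benign
  });
  // Return 0 instead of NaN when a denominator is empty.
  const ratio = (num: number, den: number): number => (den === 0 ? 0 : num / den);
  return {
    accuracy: ratio(tp + tn, truth.length),
    precision: ratio(tp, tp + fp),
    recall: ratio(tp, tp + fn),
  };
}
```

On a balanced set (100 malicious, 100 benign), accuracy is not distorted by class imbalance, but precision and recall still separate false alarms from missed threats.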