Abstract
Machine learning is often used for malicious website detection, but an
approach incorporating WebAssembly as a feature has not been explored due to a
limited number of samples, to the best of our knowledge. In this paper, we
propose JABBERWOCK (JAvascript-Based Binary EncodeR by WebAssembly Optimization
paCKer), a tool to generate WebAssembly datasets in a pseudo fashion via
JavaScript. Loosely speaking, JABBERWOCK automatically gathers JavaScript code
in the real world, converts it into WebAssembly, and then outputs vectors of
the WebAssembly as samples for malicious website detection. We also conduct
experimental evaluations of JABBERWOCK in terms of the processing time for
dataset generation, comparison of the generated samples with actual WebAssembly
samples gathered from the Internet, and an application for malicious website
detection. Regarding the processing time, we show that JABBERWOCK can construct
a dataset in 4.5 seconds per sample for any number of samples. Next, comparing
10,000 samples output by JABBERWOCK with 168 gathered WebAssembly samples, we
believe that the samples generated by JABBERWOCK are similar to those in the
real world. We then show that JABBERWOCK can provide malicious website
detection with a 99% F1-score, which we attribute to the gap that JABBERWOCK
creates between benign and malicious samples. We also confirm that
JABBERWOCK can be combined with an existing malicious website detection tool to
improve F1-scores. JABBERWOCK is publicly available via GitHub
(https://github.com/c-chocolate/Jabberwock).
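
To make the two headline numbers concrete, the following minimal sketch works through the arithmetic behind them. It assumes a simple linear cost model for the reported 4.5 seconds per sample and uses the standard definition of the F1-score. The function names and the confusion-matrix counts are hypothetical, chosen only for illustration, and are not taken from the paper or from the JABBERWOCK code base.

```python
def estimated_generation_seconds(num_samples: int,
                                 seconds_per_sample: float = 4.5) -> float:
    """Estimate total dataset generation time under a linear cost model.

    The 4.5 s default is the per-sample figure reported in the abstract;
    treating it as constant for every sample is an assumption of this sketch.
    """
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    return num_samples * seconds_per_sample


def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Standard F1-score: the harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0  # no correct positive predictions, so F1 is zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # 10,000 samples at 4.5 s each: 45,000 s, i.e. 12.5 hours.
    total = estimated_generation_seconds(10_000)
    print(f"{total:.0f} s = {total / 3600:.1f} h")

    # Hypothetical counts (not from the paper) that yield an F1-score of 0.99.
    print(f"F1 = {f1_score(true_pos=99, false_pos=1, false_neg=1):.2f}")
```

Under this model, generating the 10,000 samples used in the comparison would take roughly 12.5 hours on a single worker.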