BEACON: Behavioral Malware Classification with Large Language Model Embeddings and Deep Learning

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2509.14519

PDF

https://arxiv.org/pdf/2509.14519

Paper Information

Authors: Wadduwage Shanika Perera, Haodi Jiang
Published: 2025-09-18
Affiliation: Department of Computer Science, Sam Houston State University
Country: United States of America
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Malware Detection Scenario, Behavior Analysis Method, Evaluation Method

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Malware is becoming increasingly complex and widespread, making it essential to develop more effective and timely detection methods. Traditional static analysis often fails to defend against modern threats that employ code obfuscation, polymorphism, and other evasion techniques. In contrast, behavioral malware detection, which monitors runtime activities, provides a more reliable and context-aware solution. In this work, we propose BEACON, a novel deep learning framework that leverages large language models (LLMs) to generate dense, contextual embeddings from raw sandbox-generated behavior reports. These embeddings capture semantic and structural patterns of each sample and are processed by a one-dimensional convolutional neural network (1D CNN) for multi-class malware classification. Evaluated on the Avast-CTU Public CAPE Dataset, our framework consistently outperforms existing methods, highlighting the effectiveness of LLM-based behavioral embeddings and the overall design of BEACON for robust malware classification.
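
The classification stage described in the abstract (an embedding vector passed through a 1D CNN to produce multi-class predictions) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the embedding is a random stand-in for an LLM embedding of a sandbox report, the weights are untrained, and the layer sizes, the number of classes, and every function name (`conv1d`, `classify`, `init_params`, and so on) are assumptions chosen for illustration only.

```python
import math
import random


def conv1d(x, kernels, biases):
    """Valid (no padding), stride-1 1D convolution over a single-channel vector."""
    k = len(kernels[0])
    return [
        [sum(w[j] * x[i + j] for j in range(k)) + b for i in range(len(x) - k + 1)]
        for w, b in zip(kernels, biases)
    ]


def relu(feature_maps):
    """Element-wise ReLU on each feature map."""
    return [[max(0.0, v) for v in fm] for fm in feature_maps]


def global_max_pool(feature_maps):
    """Reduce each feature map to its single largest activation."""
    return [max(fm) for fm in feature_maps]


def dense(features, weights, biases):
    """Fully connected layer: one output logit per row of `weights`."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(num_filters, kernel_size, num_classes, seed=0):
    """Random, untrained weights (hypothetical sizes, for illustration only)."""
    rng = random.Random(seed)

    def u():
        return rng.uniform(-0.5, 0.5)

    return {
        "kernels": [[u() for _ in range(kernel_size)] for _ in range(num_filters)],
        "conv_bias": [u() for _ in range(num_filters)],
        "dense_w": [[u() for _ in range(num_filters)] for _ in range(num_classes)],
        "dense_b": [u() for _ in range(num_classes)],
    }


def classify(embedding, params):
    """Embedding -> conv1d -> ReLU -> global max pool -> dense -> softmax."""
    fmaps = relu(conv1d(embedding, params["kernels"], params["conv_bias"]))
    pooled = global_max_pool(fmaps)
    return softmax(dense(pooled, params["dense_w"], params["dense_b"]))


# Placeholder sizes; the abstract does not state the embedding dimension
# or the number of malware families.
EMBED_DIM = 32
NUM_CLASSES = 10

rng = random.Random(1)
embedding = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]  # stand-in for an LLM embedding
params = init_params(num_filters=4, kernel_size=3, num_classes=NUM_CLASSES)

probs = classify(embedding, params)
predicted = max(range(NUM_CLASSES), key=probs.__getitem__)
print("predicted class index:", predicted)
```

In a real system the embedding would come from an LLM applied to the behavior report, and the convolution and dense weights would be learned from labeled samples; the sketch only shows how the data flows from embedding to class probabilities.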

External Datasets

Avast-CTU Public CAPE Dataset