Activation Analysis of a Byte-Based Deep Neural Network for Malware Classification


IEEE Symposium on Security and Privacy Workshops

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/1903.04717

PDF

https://arxiv.org/pdf/1903.04717

Bibliographic Information

Authors: Scott E. Coull, Christopher Gardner
Published: 2019-3-12
Updated: 2019-3-20
Affiliation: FireEye, Inc.
Country of affiliation: United States of America
Conference: IEEE Symposium on Security and Privacy Workshops

Labels Estimated by AI

Malware classification using CNNs, Feature interdependence, Datasets for malware classification

* These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the Literature Database".

Abstract

Feature engineering is one of the most costly aspects of developing effective machine learning models, and that cost is even greater in specialized problem domains, like malware classification, where expert skills are necessary to identify useful features. Recent work, however, has shown that deep learning models can be used to automatically learn feature representations directly from the raw, unstructured bytes of the binaries themselves. In this paper, we explore what these models are learning about malware. To do so, we examine the learned features at multiple levels of resolution, from individual byte embeddings to end-to-end analysis of the model. At each step, we connect these byte-oriented activations to their original semantics through parsing and disassembly of the binary to arrive at human-understandable features. Through our results, we identify several interesting features learned by the model and their connection to manually-derived features typically used by traditional machine learning models. Additionally, we explore the impact of training data volume and regularization on the quality of the learned features and the efficacy of the classifiers, revealing the somewhat paradoxical insight that better generalization does not necessarily result in better performance for byte-based malware classifiers.