Malware Analysis with Symbolic Execution and Graph Kernel

TOP 文献データベース Malware Analysis with Symbolic Execution and Graph Kernel

arxiv

AIセキュリティポータルbot

文献データベースの情報は、自動的に収集されています。

Source

https://arxiv.org/abs/2204.05632

PDF

https://arxiv.org/pdf/2204.05632

文献情報

作者: Charles-Henry Bertrand Van Ouytsel;Axel Legay
公開日: 2022-4-12
所属機関: INGI, ICTEAM, Universit'e Catholique de Louvain
所属の国: Belgium
会議名: Nordic Conference on Secure IT Systems (NordSec)

AIにより推定されたラベル

機械学習の応用グラフカーネルアルゴリズム設計

※ こちらのラベルはAIによって自動的に追加されました。そのため、正確でないことがあります。
詳細は文献データベースについてをご覧ください。

Abstract

Malware analysis techniques are divided into static and dynamic analysis. Both techniques can be bypassed by circumvention techniques such as obfuscation. In a series of works, the authors have promoted the use of symbolic executions combined with machine learning to avoid such traps. Most of those works rely on natural graph-based representations that can then be plugged into graph-based learning algorithms such as Gspan. There are two main problems with this approach. The first one is in the cost of computing the graph. Indeed, working with graphs requires one to compute and representing the entire state-space of the file under analysis. As such computation is too cumbersome, the techniques often rely on developing strategies to compute a representative subgraph of the behaviors. Unfortunately, efficient graph-building strategies remain weakly explored. The second problem is in the classification itself. Graph-based machine learning algorithms rely on comparing the biggest common structures. This sidelines small but specific parts of the malware signature. In addition, it does not allow us to work with efficient algorithms such as support vector machine. We propose a new efficient open source toolchain for machine learning-based classification. We also explore how graph-kernel techniques can be used in the process. We focus on the 1-dimensional Weisfeiler-Lehman kernel, which can capture local similarities between graphs. Our experimental results show that our approach outperforms existing ones by an impressive factor.