Detecting malicious PDF using CNN

arXiv

AI Security Portal bot

The information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2007.12729

PDF

https://arxiv.org/pdf/2007.12729

Paper information

Authors: Raphael Fettaya, Yishay Mansour
Published: 2020-7-25
Updated: 2020-8-2
Affiliation: Tel Aviv University
Country: Israel
Venue: Computing Research Repository (CoRR)

Labels estimated by AI

Deep learning, Online malware detection, Performance evaluation

※ These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the Literature Database".

Abstract

Malicious PDF files represent one of the biggest threats to computer security. To detect them, significant research has been done using handwritten signatures or machine learning based on manual feature extraction. Both approaches are time-consuming and require significant prior knowledge, and the list of features has to be updated with each newly discovered vulnerability. In this work, we propose a novel algorithm that uses an ensemble of Convolutional Neural Networks (CNNs) on the byte level of the file, without any handcrafted features. We show, using a data set of 90,000 files downloadable online, that our approach maintains a high detection rate (94%) of PDF malware and even detects new malicious files that are still undetected by most antivirus products. Using features automatically generated by our CNNs and applying a clustering algorithm, we also obtain high similarity between the labels assigned by antivirus products and the resulting clusters.
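
As a rough illustration of what "on the byte level of the file" and "ensemble" can mean in practice, the following minimal Python sketch turns raw file bytes into a fixed-length token sequence and averages the scores of several models. It is not the authors' implementation. The abstract does not state the input length, the padding scheme, or how the ensemble members are combined, so `PAD_TOKEN`, `max_len`, the function names, and the averaging rule are all assumptions made for this example, and the models are stand-in callables instead of trained CNNs.

```python
from typing import Callable, List, Sequence

# Assumption: one extra symbol beyond the byte values 0..255 marks padding.
PAD_TOKEN = 256


def bytes_to_tokens(data: bytes, max_len: int) -> List[int]:
    """Map raw file bytes to a fixed-length list of integer tokens.

    Files longer than max_len are truncated; shorter ones are right-padded.
    """
    tokens = list(data[:max_len])  # each byte becomes an int in 0..255
    tokens += [PAD_TOKEN] * (max_len - len(tokens))
    return tokens


def ensemble_score(
    models: Sequence[Callable[[List[int]], float]],
    tokens: List[int],
) -> float:
    """Average the per-model maliciousness scores (assumed combination rule)."""
    if not models:
        raise ValueError("need at least one model")
    return sum(model(tokens) for model in models) / len(models)


if __name__ == "__main__":
    # Hypothetical stand-ins for trained byte-level CNNs: each returns a fixed score.
    models = [lambda t: 0.2, lambda t: 0.8]
    tokens = bytes_to_tokens(b"%PDF-1.4", max_len=16)
    print(tokens)
    print(ensemble_score(models, tokens))
```

In a real system each callable would be a trained network that consumes the token sequence, typically through an embedding layer followed by one-dimensional convolutions. Averaging is only one possible way to combine ensemble members.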