Binary code authorship identification determines authors of a binary program.
Existing techniques have used supervised machine learning for this task. In
this paper, we look at this problem from an attacker's perspective. We aim to
modify a test binary such that it not only causes misprediction but also
maintains the functionality of the original input binary. Attacks against
binary code are intrinsically more difficult than attacks against domains such
as computer vision, where attackers can change each pixel of the input image
independently and still maintain a valid image. For binary code, even flipping
a single bit may cause the binary to become invalid, to crash at
runtime, or to lose its original functionality. We investigate two types of
attacks: untargeted attacks, causing misprediction to any of the incorrect
authors, and targeted attacks, causing misprediction to a specific one among
the incorrect authors. We develop two key attack capabilities: feature vector
modification, generating an adversarial feature vector that both corresponds to
a real binary and causes the required misprediction, and input binary
modification, modifying the input binary to match the adversarial feature
vector while maintaining the functionality of the input binary. We evaluated
our attack against classifiers trained with a state-of-the-art method for
authorship attribution. The classifiers for authorship identification have 91%
accuracy on average. Our untargeted attack has a 96% success rate on average,
showing that we can effectively suppress the authorship signal. Our targeted
attack has a 46% success rate on average, showing that it is possible, though
significantly more difficult, to impersonate a specific programmer's style. Our
attack reveals that existing binary code authorship identification techniques
rely on code features that are easy to modify, and thus are vulnerable to
attacks.
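To make the two attack objectives concrete, the sketch below builds a toy linear "authorship classifier" over feature counts and greedily modifies a feature vector until the prediction flips. The classifier, feature dimensions, and greedy search are all illustrative assumptions, not the paper's method, and the harder second capability, rebuilding a real, functional binary that matches the adversarial feature vector, is not modeled here.

```python
import numpy as np

# Hypothetical stand-in for an authorship classifier: a linear model over
# integer code-feature counts. All shapes and weights are illustrative.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))  # 3 hypothetical authors, 4 hypothetical features
b = np.zeros(3)

def predict(x):
    """Predicted author index for feature vector x."""
    return int(np.argmax(W @ x + b))

def attack(x, true_author, target=None, max_steps=100):
    """Greedily change one feature count at a time.
    Untargeted (target is None): stop once the prediction is any wrong author.
    Targeted: stop once the prediction equals the chosen target author."""
    x = np.asarray(x, dtype=float).copy()
    for _ in range(max_steps):
        pred = predict(x)
        if (target is None and pred != true_author) or pred == target:
            return x
        best, best_score = None, -np.inf
        for i in range(len(x)):          # try every single-feature +/-1 change
            for d in (-1.0, 1.0):
                cand = x.copy()
                cand[i] += d
                logits = W @ cand + b
                if target is None:       # margin of best wrong author over true one
                    score = np.max(np.delete(logits, true_author)) - logits[true_author]
                else:                    # margin of target author over best rival
                    score = logits[target] - np.max(np.delete(logits, target))
                if score > best_score:
                    best, best_score = cand, score
        x = best                         # take the most helpful single change
    return x
```

The untargeted stopping condition (any wrong author) is much easier to satisfy than the targeted one (one specific author), which mirrors the gap between the 96% and 46% success rates reported above.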