Deciphering Malware's use of TLS (without Decryption)

arxiv

Source

https://arxiv.org/abs/1607.01639

PDF

https://arxiv.org/pdf/1607.01639

Paper Information

Author: Blake Anderson, Subharthi Paul, David McGrew
Published: 7-6-2016
Affiliation: Cisco
Country: United States of America
Conference: J. Comput. Virol. Hacking Tech.

Labels Estimated by AI

TLS Client Configuration, Secure Communication Channel, Data Extraction and Analysis

These labels were automatically added by AI and may be inaccurate.

Abstract

The use of TLS by malware poses new challenges to network threat detection because traditional pattern-matching techniques can no longer be applied to its messages. However, TLS also introduces a complex set of observable data features that allow many inferences to be made about both the client and the server. We show that these features can be used to detect and understand malware communication, while at the same time preserving the privacy of benign uses of encryption. These data features also allow for accurate malware family attribution of network communication, even when restricted to a single, encrypted flow. To demonstrate this, we performed a detailed study of how TLS is used by malware and enterprise applications. We provide a general analysis of millions of TLS encrypted flows, and a targeted study of 18 malware families composed of thousands of unique malware samples and tens of thousands of malicious TLS flows. Importantly, we identify and accommodate the bias introduced by the use of a malware sandbox. The performance of a malware classifier is correlated with a malware family's use of TLS, i.e., malware families that actively evolve their use of cryptography are more difficult to classify. We conclude that malware's usage of TLS is distinct from benign usage in an enterprise setting, and that these differences can be effectively used in rules and machine learning classifiers.
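As a rough illustration of what "observable data features" can mean in practice, the sketch below encodes the unencrypted part of a TLS ClientHello as a fixed-length binary vector that a rule or classifier could consume. The abstract does not list the features used, so the choice of offered ciphersuites and extensions, the small code-point vocabularies, and the names `ClientHello` and `featurize` are all assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical, deliberately tiny vocabularies of IANA code points.
# A real system would cover the full registries.
CIPHERSUITES = [0x0004, 0x0005, 0x000A, 0x002F, 0x0035, 0xC02B, 0xC02F]
EXTENSIONS = [0x0000, 0x000A, 0x000B, 0x000D, 0x0017, 0xFF01]


@dataclass
class ClientHello:
    """Hypothetical container for fields already parsed from a ClientHello."""
    ciphersuites: List[int]  # offered ciphersuite code points
    extensions: List[int]    # advertised extension type code points


def featurize(hello: ClientHello) -> List[int]:
    """Return one 0/1 entry per vocabulary item: 1 if the client offered it."""
    offered_suites = set(hello.ciphersuites)
    offered_exts = set(hello.extensions)
    suite_bits = [int(code in offered_suites) for code in CIPHERSUITES]
    ext_bits = [int(code in offered_exts) for code in EXTENSIONS]
    return suite_bits + ext_bits


# Example flow: offers one RC4 suite and one ECDHE-GCM suite,
# and advertises server_name and signature_algorithms.
example = ClientHello(ciphersuites=[0x0005, 0xC02F], extensions=[0x0000, 0x000D])
vector = featurize(example)
```

Because the vector is built only from handshake metadata sent in the clear, no decryption of application data is needed, which is consistent with the privacy-preserving framing of the abstract.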

External Datasets

malware traffic collected from August 2015 to May 2016

enterprise traffic collected during a 4-day period in May 2016

1,500,005 TLS flows from an enterprise network

133,744 TLS flows initiated by malicious programs