Information systems have long been a frequent target of malware attacks.
Traditional signature-based malware detection algorithms can only detect known
malware and are vulnerable to evasion techniques such as binary obfuscation,
while behavior-based approaches rely heavily on malware training samples and
incur prohibitively high training costs. To address the
limitations of existing techniques, we propose MatchGNet, a heterogeneous Graph
Matching Network model that jointly learns a graph representation and a
similarity metric from an invariant graph model of a program's execution
behaviors. We conduct a systematic evaluation of our model and show that it is
accurate in detecting malicious program behavior and can help detect malware
attacks with fewer false positives. MatchGNet outperforms state-of-the-art
malware detection algorithms, generating 50% fewer false positives while
maintaining zero false negatives.