Using a previously introduced similarity function for the stream of system
calls generated by a computer, we engineer a program-in-execution classifier
using deep learning methods. Tested on malware classification, it significantly
outperforms the current state of the art. We provide a series of performance
measures and tests to demonstrate its capabilities, including measurements from
production use. We show that the system scales linearly with the number of
endpoints. Using the system, we estimate the total number of malware families
created over the last 10 years at 3450, in line with reasonable economic
constraints. Because new malware families appear at a lower rate than previously
acknowledged, machine learning malware classifiers risk being
tested on their training set; we achieve F1 = 0.995 in a test carefully
designed to mitigate this risk.
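
The abstract does not describe how the test is constructed. As an illustration only, one common way to reduce train/test leakage is to split by malware family, so that no family appears on both sides, and then report F1 on the held-out families. The sketch below assumes binary malware/benign labels and samples tagged with a family name; the function names `family_disjoint_split` and `f1_score`, the sample identifiers, and the 20% test fraction are all hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def family_disjoint_split(samples, test_fraction=0.2, seed=0):
    """Split (sample_id, family) pairs so no family is in both train and test.

    Illustrative assumption: family labels are known for every sample.
    """
    by_family = defaultdict(list)
    for sample_id, family in samples:
        by_family[family].append(sample_id)

    # Shuffle the family names, not the samples, so whole families move together.
    families = sorted(by_family)
    random.Random(seed).shuffle(families)
    n_test = max(1, round(len(families) * test_fraction))
    test_families = set(families[:n_test])

    train = [(s, f) for s, f in samples if f not in test_families]
    test = [(s, f) for s, f in samples if f in test_families]
    return train, test


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for one positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical data: five families with two samples each.
    data = [(f"s{i}", f"family_{i % 5}") for i in range(10)]
    train, test = family_disjoint_split(data)
    print(len(train), len(test))
    print(f1_score([1, 1, 0, 0], [1, 0, 0, 1]))
```

Splitting on families rather than on individual samples matters here because near-identical variants of one family would otherwise land on both sides of the split and inflate the score.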