Modern malware families often rely on domain-generation algorithms (DGAs) to
determine rendezvous points with their command-and-control servers. Traditional
defence strategies (such as blacklisting domains or IP addresses) are
inadequate against such techniques due to the large and continuously changing
list of domains produced by these algorithms. This paper demonstrates that a
machine learning approach based on recurrent neural networks is able to detect
domain names generated by DGAs with high precision. The neural models are
estimated on a large training set of domains generated by various malware families.
Experimental results show that this data-driven approach can detect
malware-generated domain names with an F_1 score of 0.971. To put it
differently, the model can automatically detect 93% of malware-generated
domain names at a false positive rate of 1:100.
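
To illustrate why a static blacklist struggles against DGAs, the following is a minimal sketch of a toy generator. It is not the algorithm of any real malware family, nor anything described in this paper; the function name `toy_dga`, the seed string, the hash-based construction, and the `.com` suffix are all assumptions made purely for illustration. It shows how a shared seed and the current date could let malware and its operator derive the same fresh list of domains every day.

```python
import hashlib
from datetime import date


def toy_dga(seed: str, day: date, count: int = 5, tld: str = ".com") -> list:
    """Hypothetical toy DGA: derive `count` pseudo-random domains from a seed and a date."""
    domains = []
    for i in range(count):
        # Hash the shared seed, the date, and a counter to get deterministic pseudo-random bytes.
        material = f"{seed}:{day.isoformat()}:{i}".encode("ascii")
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits to the letters 'a'..'p' to form a domain label.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    # Both sides can compute the same list independently; it changes with the date.
    for name in toy_dga("example-seed", date(2024, 1, 1)):
        print(name)
```

Because the output changes with the date while remaining reproducible from the seed, a defender would have to anticipate every future list in order to blacklist it, which is the difficulty the abstract describes.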