Learning in adversarial settings is becoming an important task for
application domains where attackers may inject malicious data into the training
set to subvert the normal operation of data-driven technologies. Feature selection
has been widely used in machine learning for security applications to improve
generalization and computational efficiency, although it remains unclear whether
its use is beneficial or even counterproductive when the training data are
poisoned by intelligent attackers. In this work, we shed light on this issue by
providing a framework to investigate the robustness of popular feature
selection methods, including LASSO, ridge regression, and the elastic net. Our
results on malware detection show that feature selection methods can be
significantly compromised under attack (we can reduce LASSO to an almost
random choice of the feature set by carefully inserting fewer than 5%
poisoned training samples), highlighting the need for specific countermeasures.
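
To make the flavor of such an attack concrete, the following is a minimal
sketch, not the attack developed in this work: it fits scikit-learn's Lasso on
synthetic sparse-regression data, injects roughly 5% crafted points whose
responses contradict the clean signal (a simple stand-in poisoning heuristic),
and compares the selected feature sets before and after via their Jaccard
overlap. The data-generation setup, the poisoning heuristic, and the
regularization strength alpha=0.1 are all illustrative assumptions.

```python
# Simplified illustration of training-data poisoning against LASSO feature
# selection; the poisoning strategy is a naive stand-in, not an optimized attack.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, d, k = 200, 50, 5                       # samples, features, informative features
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:k] = 1.0                           # ground-truth sparse weights
y = X @ w_true + 0.1 * rng.normal(size=n)

def selected(X, y, alpha=0.1):
    """Return the set of feature indices with nonzero LASSO coefficients."""
    return set(np.flatnonzero(Lasso(alpha=alpha).fit(X, y).coef_))

clean_set = selected(X, y)

# Inject ~5% poisoned samples: large-magnitude points whose responses
# contradict the clean signal, pulling weight onto irrelevant features.
n_poison = int(0.05 * n)
Xp = rng.normal(size=(n_poison, d)) * 3.0
yp = -(Xp @ w_true) + 3.0 * rng.normal(size=n_poison)
poisoned_set = selected(np.vstack([X, Xp]), np.concatenate([y, yp]))

# Jaccard overlap of the two selected feature sets (1.0 = unchanged selection).
jaccard = len(clean_set & poisoned_set) / len(clean_set | poisoned_set)
print(f"clean features:    {sorted(clean_set)}")
print(f"poisoned features: {sorted(poisoned_set)}")
print(f"Jaccard overlap:   {jaccard:.2f}")
```

Even this naive heuristic typically perturbs the selected feature set; an
attacker who optimizes the poisoning points can degrade the selection far
more severely, as reported above.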