Machine learning models trained on data from the outside world can be
corrupted by data poisoning attacks that inject malicious points into the
models' training sets. A common defense against these attacks is data
sanitization: filtering out anomalous training points before training the
model. In this paper, we develop three attacks that can bypass a broad range of
common data sanitization defenses, including anomaly detectors based on nearest
neighbors, training loss, and singular-value decomposition. By adding just 3%
poisoned data, our attacks successfully increase test error on the Enron spam
detection dataset from 3% to 24% and on the IMDB sentiment classification
dataset from 12% to 29%. In contrast, existing attacks that do not explicitly
account for these data sanitization defenses are defeated by them. Our attacks
are based on two ideas: (i) we coordinate our attacks to place poisoned points
near one another, and (ii) we formulate each attack as a constrained
optimization problem, with constraints designed to ensure that the poisoned
points evade detection. As this optimization involves solving an expensive
bilevel problem, our three attacks correspond to different ways of
approximating this problem, based on influence functions; minimax duality; and
the Karush-Kuhn-Tucker (KKT) conditions. Our results underscore the need to
develop more robust defenses against data poisoning attacks.
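To give a concrete picture of the defenses referred to above, the following is a minimal illustrative sketch of two sanitization filters of the kind named in the abstract (nearest-neighbor and loss-based anomaly detection). The function names, thresholds, and quantile choices are assumptions for illustration, not the paper's exact defense configuration.

```python
# Minimal sketch of two data sanitization defenses: drop points that are far
# from their k-th nearest same-class neighbor, or that have unusually high
# training loss. Thresholds and quantiles here are illustrative assumptions.
import numpy as np

def knn_filter(X, y, k=5, quantile=0.95):
    """Return a boolean mask keeping points whose distance to their k-th
    nearest neighbor within the same class is below a per-class quantile."""
    keep = np.ones(len(X), dtype=bool)
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        Xc = X[idx]
        # Pairwise Euclidean distances within the class.
        dists = np.linalg.norm(Xc[:, None, :] - Xc[None, :, :], axis=-1)
        # Column 0 after sorting is the self-distance (0), so column k is the
        # distance to the k-th nearest other point in the class.
        kth = np.sort(dists, axis=1)[:, k]
        keep[idx] = kth <= np.quantile(kth, quantile)
    return keep

def loss_filter(losses, quantile=0.95):
    """Return a boolean mask keeping points whose training loss is below the
    given quantile of all training losses."""
    return losses <= np.quantile(losses, quantile)
```

Filters of this kind are what defeat attacks that ignore sanitization: isolated, high-loss poisoned points are exactly what such anomaly detectors remove.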
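To make the attack formulation concrete, the following is a sketch of the kind of constrained bilevel problem described above, written in notation that is ours rather than the paper's: $\mathcal{D}_c$ denotes the clean training set of size $n$, $\mathcal{D}_p$ the poisoned points (at most a fraction $\epsilon$ of the data), $\mathcal{F}$ the feasible set of points that pass the sanitization defense, and $\ell$ and $L$ the training and test losses.

\[
\max_{\mathcal{D}_p \subseteq \mathcal{F},\; |\mathcal{D}_p| \le \epsilon n} \; L\bigl(\hat{\theta};\, \mathcal{D}_{\mathrm{test}}\bigr)
\qquad \text{s.t.} \qquad
\hat{\theta} \in \arg\min_{\theta} \sum_{(x,y) \in \mathcal{D}_c \cup \mathcal{D}_p} \ell(\theta; x, y).
\]

The inner $\arg\min$ (retraining on the clean plus poisoned data) is what makes the problem bilevel and expensive; the influence-function, minimax-duality, and KKT attacks each replace this inner problem with a cheaper approximation, while the constraint $\mathcal{D}_p \subseteq \mathcal{F}$ keeps the poisoned points inside the region that filters like the ones sketched above do not remove.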