Clustering algorithms are used in a wide range of applications and play an
important role in modern machine learning; yet, unlike in supervised learning,
adversarial attacks on clustering algorithms have been largely overlooked.
In this paper, we seek to bridge this gap by proposing a black-box adversarial
attack on clustering models with linearly separable clusters. Our attack works
by perturbing a single sample close to the decision boundary, causing multiple
unperturbed samples to be misclustered; we call these spill-over adversarial
samples. We theoretically show that such adversarial samples exist for
K-Means clustering. Our attack is especially strong as (1) we ensure the
perturbed sample is not an outlier and therefore hard to detect, and (2) it
does not require the attacker to know the exact metric used for clustering. We
theoretically justify that the attack can succeed even without knowledge of
the true metric. We conclude by providing empirical results on a number of
datasets and clustering algorithms. To the best of our knowledge, this is the
first work to generate spill-over adversarial samples without knowledge of the
true metric, to ensure the perturbed sample is not an outlier, and to prove
these properties theoretically.
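
As a toy illustration (not the attack proposed in the paper), the following
Python sketch shows the spill-over phenomenon on hand-crafted, linearly
separable data using scikit-learn's KMeans: perturbing one sample near the
decision boundary flips the cluster assignment of a different, unperturbed
sample. The data, the chosen perturbation, and the index bookkeeping are all
illustrative assumptions, not the paper's method.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hand-crafted, linearly separable data: two tight clusters plus two
# samples sitting between them, near the K-Means decision boundary.
X = np.array([
    [0.0, 0.0], [0.2, 0.0], [-0.2, 0.0], [0.0, 0.2], [0.0, -0.2],  # cluster 1
    [1.3, 0.0],  # index 5: the sample the attacker perturbs
    [1.5, 0.0],  # index 6: unperturbed; the spill-over candidate
    [3.0, 0.0], [3.2, 0.0], [2.8, 0.0], [3.0, 0.2], [3.0, -0.2],   # cluster 2
])

before = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Perturb ONLY sample 5, nudging it toward the other cluster while keeping
# it between the two clusters (so it is not an obvious outlier).
X_adv = X.copy()
X_adv[5] = [2.2, 0.0]

after = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_adv)

# K-Means label ids are arbitrary, so align the two labelings first.
if np.mean(before == after) < 0.5:
    after = 1 - after

spill_over = [i for i in np.flatnonzero(before != after) if i != 5]
print("misclustered but unperturbed samples:", spill_over)  # typically [6]
```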