arXiv
Softmax-based Classification is k-means Clustering: Formal Proof, Consequences for Adversarial Attacks, and Improvement through Centroid Based Tailoring
We formally prove the connection between k-means clustering and the
predictions of neural networks based on the softmax activation layer. In
existing work, this connection has been analyzed empirically, but it has never
before been mathematically derived. The softmax function partitions the
transformed input space into cones, each of which encompasses a class. This is
equivalent to putting a number of centroids in this transformed space at equal
distance from the origin, and k-means clustering the data points by proximity
to these centroids. The softmax prediction depends only on the cone in which a
data point falls, not on how far the point lies from the centroid within that
cone. We formally prove that
networks with a small Lipschitz modulus (which corresponds to a low
susceptibility to adversarial attacks) map data points closer to the cluster
centroids, which results in a mapping to a k-means-friendly space. To leverage
this knowledge, we propose Centroid Based Tailoring as an alternative to the
softmax function in the last layer of a neural network. The resulting Gauss
network achieves predictive accuracy similar to that of traditional networks,
but is less susceptible to one-pixel attacks. While the main contribution of
this paper is theoretical in nature, the Gauss network provides auxiliary
empirical benefits.
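
The equivalence between the softmax prediction and nearest-centroid assignment can be checked numerically. The following is a minimal sketch, not the construction used in the paper: it assumes a bias-free linear last layer whose weight rows all have the same norm and treats those rows as the centroids. Under that assumption, ||x - w_k||^2 = ||x||^2 - 2 w_k·x + r^2, so the largest logit w_k·x belongs to the closest centroid. The dimensions, the radius 3.0, the scale factor 5.0, and all variable names are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the paper).
num_classes, dim, num_points = 4, 8, 1000

# Centroids: random directions rescaled to a common norm, so that
# every centroid lies at the same distance from the origin.
centroids = rng.normal(size=(num_classes, dim))
centroids /= np.linalg.norm(centroids, axis=1, keepdims=True)
centroids *= 3.0

# Points in the transformed space (stand-ins for penultimate-layer outputs).
points = rng.normal(size=(num_points, dim))


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


# Softmax prediction with the centroids as bias-free last-layer weights.
logits = points @ centroids.T
softmax_labels = softmax(logits).argmax(axis=1)

# Nearest-centroid assignment (the k-means assignment step).
dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
nearest_labels = dists.argmin(axis=1)

# Moving each point along its ray from the origin keeps it in the same cone,
# so the softmax prediction should not change.
scaled_labels = softmax((5.0 * points) @ centroids.T).argmax(axis=1)

print("softmax vs. nearest centroid agreement:",
      np.mean(softmax_labels == nearest_labels))
print("prediction unchanged under positive scaling:",
      np.mean(softmax_labels == scaled_labels))
```

The scaling check illustrates the cone picture: only the direction of a point determines its predicted class, while its distance from the centroid within the cone plays no role. If the weight rows had different norms or the layer had a bias term, the simple distance identity used here would no longer hold in this form.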