We develop Neuron Shapley as a new framework to quantify the contribution of
individual neurons to the prediction and performance of a deep network. By
accounting for interactions across neurons, Neuron Shapley is more effective in
identifying important filters compared to common approaches based on activation
patterns. Interestingly, removing just 30 filters with the highest Shapley
scores effectively destroys the prediction accuracy of Inception-v3 on
ImageNet. Visualization of these few critical filters provides insights into
how the network functions. Neuron Shapley is a flexible framework and can be
applied to identify responsible neurons in many tasks. We illustrate additional
applications of identifying filters that are responsible for biased prediction
in facial recognition and filters that are vulnerable to adversarial attacks.
Removing these filters is a quick way to repair models. Enabling all these
applications is a new multi-arm bandit algorithm that we developed to
efficiently estimate Neuron Shapley values.