Motivated by safety-critical classification problems, we investigate
adversarial attacks against cost-sensitive classifiers. We use current
state-of-the-art adversarially resistant neural network classifiers [1] as the
underlying models. Cost-sensitive predictions are then achieved via a final
processing step in the feed-forward evaluation of the network. We evaluate the
effectiveness of cost-sensitive classifiers against a variety of attacks, and we
introduce a new cost-sensitive attack that outperforms targeted attacks in some
cases. We also explore the measures a defender can take to limit vulnerability
to these attacks. This attacker/defender scenario is naturally framed as a
two-player zero-sum finite game, which we analyze using game theory.
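For concreteness, a minimal sketch of one standard way to realize a cost-sensitive prediction as a final processing step: given the network's class probabilities and a cost matrix, predict the class with minimum expected cost. The function name and example cost matrix below are illustrative, and the paper's exact procedure may differ.

```python
import numpy as np

def cost_sensitive_predict(probs, cost_matrix):
    """Pick the class that minimizes expected cost.

    probs: (n_classes,) softmax output of the underlying network.
    cost_matrix: (n_classes, n_classes) array where cost_matrix[i, j]
        is the cost of predicting class j when the true class is i.
    """
    expected_costs = probs @ cost_matrix  # expected cost of each candidate prediction
    return int(np.argmin(expected_costs))

# Example: binary task where a false negative (true 1, predicted 0)
# is ten times as costly as a false positive.
C = np.array([[0.0, 1.0],
              [10.0, 0.0]])
print(cost_sensitive_predict(np.array([0.7, 0.3]), C))  # prefers class 1
```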