Abstract
Decision-based attacks construct adversarial examples against a machine
learning (ML) model by making only hard-label queries. These attacks have
mainly been applied directly to standalone neural networks. However, in
practice, ML models are just one component of a larger learning system. We find
that by adding a single preprocessor in front of a classifier, state-of-the-art
query-based attacks are up to 7$\times$ less effective at attacking a
prediction pipeline than at attacking the model alone. We explain this
discrepancy by the fact that most preprocessors introduce some notion of
invariance to the input space. Hence, attacks that are unaware of this
invariance inevitably waste a large number of queries to re-discover or
overcome it. We therefore develop techniques to (i) reverse-engineer the
preprocessor and then (ii) use this extracted information to attack the
end-to-end system. Our preprocessor extraction method requires only a few
hundred queries, and our preprocessor-aware attacks recover the same efficacy
as when attacking the model alone. The code can be found at
https://github.com/google-research/preprocessor-aware-black-box-attack.
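
To make the invariance argument concrete, the following is a minimal toy sketch, not the authors' method or code. It assumes a hypothetical quantization preprocessor (`quantize`) in front of a hypothetical hard-label classifier (`classify`); all names, the bin count, and the input values are illustrative assumptions. It counts how many queries are "wasted" because the perturbed input collapses to the same preprocessed input as the original, so the pipeline cannot return new information.

```python
# Toy illustration only: every function and constant here is a hypothetical
# stand-in, not part of the paper's released code.

def quantize(x, levels=8):
    """Assumed preprocessor: snap each value in [0, 1] to one of `levels` bins."""
    return [round(v * (levels - 1)) / (levels - 1) for v in x]

def classify(x):
    """Assumed hard-label model: returns 1 if the mean exceeds 0.5, else 0."""
    return int(sum(x) / len(x) > 0.5)

def pipeline(x):
    """End-to-end system the attacker can query: preprocessor, then model."""
    return classify(quantize(x))

def count_wasted_queries(x, step, n_queries):
    """Count queries whose perturbed input preprocesses to the same point as x.

    Such a query necessarily yields the same hard label as x, so an attacker
    who is unaware of the preprocessor learns nothing from it.
    """
    base = quantize(x)
    wasted = 0
    for i in range(n_queries):
        idx = i % len(x)  # perturb one coordinate per query
        perturbed = list(x)
        perturbed[idx] = min(1.0, max(0.0, perturbed[idx] + step))
        if quantize(perturbed) == base:
            wasted += 1
    return wasted

x = [0.30, 0.45, 0.60, 0.80]
small = count_wasted_queries(x, step=0.01, n_queries=4)  # stays inside each bin
large = count_wasted_queries(x, step=0.15, n_queries=4)  # crosses bin boundaries
print(small, large)  # prints "4 0"
```

In this toy setting, every small-step query is invisible to the pipeline, while steps sized to cross a quantization boundary always change the preprocessed input. An attack that knows the bin width can choose such steps directly, which is the intuition behind making the attack preprocessor-aware.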