These labels were automatically added by AI and may be inaccurate. For details, see About Literature Database.
Abstract
Black-box query-based attacks constitute significant threats to Machine
Learning as a Service (MLaaS) systems since they can generate adversarial
examples without accessing the target model's architecture and parameters.
Traditional defense mechanisms, such as adversarial training, gradient masking,
and input transformations, either impose substantial computational costs or
compromise the test accuracy of non-adversarial inputs. To address these
challenges, we propose an efficient defense mechanism, PuriDefense, that
employs random patch-wise purifications with an ensemble of lightweight
purification models at a low level of inference cost. These models leverage the
local implicit function and rebuild the natural image manifold. Our theoretical
analysis suggests that this approach slows down the convergence of query-based
attacks by incorporating randomness into purifications. Extensive experiments
on CIFAR-10 and ImageNet validate the effectiveness of our proposed
purifier-based defense mechanism, demonstrating significant improvements in
robustness against query-based attacks.