Abstract
Despite significant advancements in the area, adversarial robustness remains
a critical challenge in systems employing machine learning models. The removal
of adversarial perturbations at inference time, known as adversarial
purification, has emerged as a promising defense strategy. To achieve this,
state-of-the-art methods leverage diffusion models that inject Gaussian noise
during a forward process to dilute adversarial perturbations, followed by a
denoising step to restore clean samples before classification. In this work, we
propose FlowPure, a novel purification method based on Continuous Normalizing
Flows (CNFs) trained with Conditional Flow Matching (CFM) to learn mappings
from adversarial examples to their clean counterparts. Unlike prior
diffusion-based approaches that rely on fixed noise processes, FlowPure can
leverage specific attack knowledge to improve robustness under known threats,
while also supporting a more general stochastic variant trained on Gaussian
perturbations for settings where such knowledge is unavailable. Experiments on
CIFAR-10 and CIFAR-100 demonstrate that our method outperforms state-of-the-art
purification-based defenses in preprocessor-blind and white-box scenarios, and
can do so while fully preserving benign accuracy in the former. Moreover, our
results show that FlowPure is not only a highly effective purifier but also
holds strong potential for adversarial detection, identifying
preprocessor-blind PGD samples with near-perfect accuracy.
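
To make the training and purification idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the straight-line interpolation path commonly used with Conditional Flow Matching, x_t = (1 - t) * x_adv + t * x_clean, whose target velocity is x_clean - x_adv, and it purifies with fixed-step Euler integration. The abstract specifies neither the path nor the solver. The names `cfm_pair`, `cfm_loss`, `purify`, and `ideal_velocity` are hypothetical, and the toy velocity field stands in for a trained network.

```python
def cfm_pair(x_adv, x_clean, t):
    """Return the point at time t on the straight path from x_adv (t=0)
    to x_clean (t=1), plus the constant target velocity of that path.
    The linear path is an assumption, not taken from the abstract."""
    x_t = [(1.0 - t) * a + t * c for a, c in zip(x_adv, x_clean)]
    u_t = [c - a for a, c in zip(x_adv, x_clean)]
    return x_t, u_t


def cfm_loss(velocity_fn, x_adv, x_clean, t):
    """Mean squared error between the predicted and target velocity."""
    x_t, u_t = cfm_pair(x_adv, x_clean, t)
    v = velocity_fn(x_t, t)
    return sum((vi - ui) ** 2 for vi, ui in zip(v, u_t)) / len(u_t)


def purify(velocity_fn, x, steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with explicit Euler steps."""
    h = 1.0 / steps
    for k in range(steps):
        v = velocity_fn(x, k * h)
        x = [xi + h * vi for xi, vi in zip(x, v)]
    return x


# Toy setting: a known, fixed perturbation delta is added to a clean sample.
delta = [0.03, -0.03, 0.03]
x_clean = [0.2, 0.5, 0.8]
x_adv = [c + d for c, d in zip(x_clean, delta)]


def ideal_velocity(x, t):
    """Stand-in for a trained network: the exact velocity for this toy case."""
    return [-d for d in delta]


loss = cfm_loss(ideal_velocity, x_adv, x_clean, t=0.4)
x_purified = purify(ideal_velocity, x_adv)
```

In this toy case the ideal velocity field drives the loss to (numerically) zero, and integrating it moves the perturbed sample back to the clean one. In practice, `velocity_fn` would be a neural network trained by minimizing `cfm_loss` over sampled pairs and times.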