Abstract
Adversarial patch-based attacks have been shown to be a major deterrent to
the reliable use of machine learning models. These attacks involve the
strategic modification of localized patches or specific image areas to deceive
trained machine learning models. In this paper, we propose
\textit{DefensiveDR}, a practical mechanism using a dimensionality reduction
technique to thwart such patch-based attacks. Our method involves projecting
the sample images onto a lower-dimensional space while retaining essential
information or variability for effective machine learning tasks. We perform
this using two techniques, Singular Value Decomposition and t-Distributed
Stochastic Neighbor Embedding. We treat the variability to be preserved as a
hyper-parameter and tune it experimentally for optimal performance. This dimension
reduction substantially mitigates adversarial perturbations, thereby enhancing
the robustness of the given machine learning model. Our defense is
model-agnostic and operates without assumptions about access to model decisions
or model architectures, making it effective in both black-box and white-box
settings. Furthermore, it maintains accuracy across various models and remains
robust against several unseen patch-based attacks. The proposed defensive
approach improves the accuracy from 38.8\% (without defense) to 66.2\% (with
defense) under LaVAN and GoogleAp attacks, surpassing prominent
state-of-the-art defenses such as LGS (53.86\%) and Jujutsu (60\%).
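
As an illustration of the SVD-based projection described above, the following is a minimal sketch and not the authors' implementation. It assumes a single-channel (grayscale) image stored as a 2-D NumPy array, and it assumes that "variability" is measured as the cumulative share of squared singular values; the abstract does not specify either choice. The names `svd_reduce` and `variability` are hypothetical. The t-SNE variant is not shown.

```python
import numpy as np


def svd_reduce(image, variability=0.9):
    """Low-rank reconstruction of a 2-D image via truncated SVD.

    Keeps the smallest number of singular components k whose squared
    singular values account for at least `variability` of the total.
    Returns the reconstructed image and k.
    (Assumption: variability == cumulative squared-singular-value share.)
    """
    image = np.asarray(image, dtype=float)
    u, s, vt = np.linalg.svd(image, full_matrices=False)

    total = np.sum(s ** 2)
    if total == 0.0:
        # An all-zero image has nothing to preserve.
        return np.zeros_like(image), 0

    # Cumulative fraction of energy explained by the first i+1 components.
    energy = np.cumsum(s ** 2) / total

    # First index whose cumulative energy reaches the threshold, as a count.
    k = int(np.searchsorted(energy, variability)) + 1
    k = min(k, len(s))  # guard against floating-point shortfall at 1.0

    # Rebuild the image from the top-k components only.
    low_rank = (u[:, :k] * s[:k]) @ vt[:k, :]
    return low_rank, k
```

In such a sketch, the reduced image would be passed to the classifier in place of the original input, and `variability` would be chosen by validation on clean and patched samples, in the spirit of the hyper-parameter tuning mentioned above.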