Abstract
Adversarial patches exemplify the tangible manifestation of the threat posed
by adversarial attacks on Machine Learning (ML) models in real-world scenarios.
Robustness against these attacks is of the utmost importance when designing
computer vision applications, especially for safety-critical domains such as
CCTV systems. In most practical situations, monitoring open spaces requires
multi-view systems to overcome acquisition challenges such as occlusion
handling. Multiview object detection systems combine data from multiple
views and reach reliable detection results even in difficult environments.
Despite their importance in real-world vision applications, the vulnerability of
multiview systems to adversarial patches has not been sufficiently investigated. In
this paper, we raise the following question: Do the increased performance and
information sharing across views offer, as a by-product, robustness to
adversarial patches? We first conduct a preliminary analysis showing promising
robustness against off-the-shelf adversarial patches, even in an extreme
setting where we consider patches applied to all views by all persons in
the Wildtrack benchmark. However, we challenge this observation by proposing two
new attacks: (i) In the first attack, targeting a multiview CNN, we maximize
the global loss by projecting the gradient onto the different views and
aggregating the resulting local gradients. (ii) In the second attack, we focus
on a Transformer-based multiview framework. In addition to the focal loss, we
also maximize the transformer-specific loss by dissipating its attention
blocks. Our results show a large degradation in the detection performance of
victim multiview systems, with our first patch attack reaching an attack success
rate of 73%, while our second proposed attack reduces the performance of its
target detector by 62%.