Adversarial Out-domain Examples for Generative Models

Authors: Dario Pasquini, Marco Mingione, Massimo Bernaschi | Published: 2019-03-07 | Updated: 2019-05-13

2019.03.072025.04.03

Authors: Dario Pasquini, Marco Mingione, Massimo Bernaschi
Published: 2019-03-07 | Updated: 2019-05-13

Source: https://arxiv.org/abs/1903.02926

PDF: https://arxiv.org/pdf/1903.02926

AIにより推定されたラベル

敵対的学習敵対的訓練 Out-of-Distribution検出

※ こちらのラベルはAIによって自動的に追加されました。そのため、正確でないことがあります。
詳細は文献データベースについてをご覧ください。

Abstract

Deep generative models are rapidly becoming a common tool for researchers and developers. However, as exhaustively shown for the family of discriminative models, the test-time inference of deep neural networks cannot be fully controlled and erroneous behaviors can be induced by an attacker. In the present work, we show how a malicious user can force a pre-trained generator to reproduce arbitrary data instances by feeding it suitable adversarial inputs. Moreover, we show that these adversarial latent vectors can be shaped so as to be statistically indistinguishable from the set of genuine inputs. The proposed attack technique is evaluated with respect to various GAN images generators using different architectures, training processes and for both conditional and not-conditional setups.