AI Security Portal K Program
Reinforcement Learning-Based Black-Box Model Inversion Attacks
Abstract
Model inversion attacks are a type of privacy attack that reconstructs the private data used to train a machine learning model solely by accessing the model. Recently, white-box model inversion attacks that leverage Generative Adversarial Networks (GANs) to distill knowledge from public datasets have received considerable attention because of their strong attack performance. In contrast, current black-box model inversion attacks that use GANs have two shortcomings: they cannot guarantee that the attack completes within a predetermined number of queries, and they do not match the performance of white-box attacks. To overcome these limitations, we propose a reinforcement learning-based black-box model inversion attack. We formulate the latent space search as a Markov Decision Process (MDP) and solve it with reinforcement learning. Our method uses the confidence scores that the target model assigns to the generated images as rewards for an agent. The private data can then be reconstructed from the latent vectors found by the agent trained in the MDP. Experimental results on various datasets and models demonstrate that our attack successfully recovers the private information of the target model and achieves state-of-the-art attack performance. By proposing a more advanced black-box model inversion attack, we underscore the importance of research on privacy-preserving machine learning.
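The latent space search described in the abstract can be pictured with a small sketch. The code below is a toy illustration under stated assumptions, not the authors' implementation: the abstract does not specify the state, action, or transition design, so the blending rule, the log-confidence reward, the class `LatentSearchEnv`, and the helpers `make_toy_models` and `run_random_policy` are all hypothetical. The generator and target model are random linear stand-ins, and a random policy takes the place of the trained agent. It shows only how confidence scores can act as rewards and how a fixed query budget can be enforced.

```python
import numpy as np


class LatentSearchEnv:
    """Toy MDP over a generator's latent space (hypothetical design)."""

    def __init__(self, generator, target_model, target_class, latent_dim,
                 alpha=0.5, episode_len=10, seed=0):
        self.generator = generator        # maps a latent vector to an "image"
        self.target_model = target_model  # black box: returns confidence scores
        self.target_class = target_class
        self.latent_dim = latent_dim
        self.alpha = alpha                # assumed blending weight
        self.episode_len = episode_len
        self.rng = np.random.default_rng(seed)
        self.queries = 0                  # number of black-box queries so far
        self.state = None
        self.t = 0

    def reset(self):
        # Each episode starts from a freshly sampled latent vector.
        self.state = self.rng.standard_normal(self.latent_dim)
        self.t = 0
        return self.state.copy()

    def step(self, action):
        # Assumed transition: blend the current latent with the action.
        self.state = self.alpha * self.state + (1.0 - self.alpha) * action
        image = self.generator(self.state)
        probs = self.target_model(image)  # only the scores are observed
        self.queries += 1
        # Assumed reward: log-confidence of the target class.
        reward = float(np.log(probs[self.target_class] + 1e-12))
        self.t += 1
        done = self.t >= self.episode_len
        return self.state.copy(), reward, done


def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()


def make_toy_models(latent_dim=8, image_dim=16, num_classes=5, seed=1):
    """Random linear stand-ins for a GAN generator and a target classifier."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((image_dim, latent_dim))
    W = rng.standard_normal((num_classes, image_dim))

    def generator(z):
        return np.tanh(G @ z)

    def target_model(x):
        return softmax(W @ x)

    return generator, target_model


def run_random_policy(env, max_queries=200):
    """Placeholder for a trained agent: random actions, fixed query budget."""
    best_reward, best_latent = -np.inf, None
    while env.queries < max_queries:
        state = env.reset()
        done = False
        while not done and env.queries < max_queries:
            action = env.rng.standard_normal(env.latent_dim)
            state, reward, done = env.step(action)
            if reward > best_reward:
                best_reward, best_latent = reward, state
    return best_latent, best_reward


if __name__ == "__main__":
    generator, target_model = make_toy_models()
    env = LatentSearchEnv(generator, target_model, target_class=2, latent_dim=8)
    z, r = run_random_policy(env, max_queries=200)
    print("queries used:", env.queries)
    print("best log-confidence:", r)
```

In a real attack, the random policy would be replaced by a learned one (for example, an off-policy actor-critic agent), and `generator` and `target_model` would be a GAN trained on public data and the queried classifier. Because the loop stops as soon as `env.queries` reaches `max_queries`, the search is guaranteed to finish within the stated budget.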