Learn What You Want to Unlearn: Unlearning Inversion Attacks against Machine Unlearning
Abstract
Machine unlearning has become a promising solution for fulfilling the "right to be forgotten", under which individuals can request the deletion of their data from machine learning models. However, existing studies of machine unlearning mainly focus on the efficacy and efficiency of unlearning methods and neglect the privacy vulnerabilities introduced by the unlearning process itself. Because an adversary may have access to two versions of a model, the original model and the unlearned model, machine unlearning opens up a new attack surface. In this paper, we conduct the first investigation of the extent to which machine unlearning can leak the confidential content of the unlearned data. Specifically, under the Machine Learning as a Service setting, we propose unlearning inversion attacks that can reveal the feature and label information of an unlearned sample with access only to the original and unlearned models. We evaluate the effectiveness of the proposed attacks through extensive experiments on benchmark datasets, across various model architectures, and on representative exact and approximate unlearning approaches. The experimental results indicate that the proposed attacks can reveal sensitive information about the unlearned data. We therefore identify three possible defenses that help mitigate the proposed attacks, though at the cost of reducing the utility of the unlearned model. This study uncovers an underexplored gap between machine unlearning and the privacy of unlearned data, highlighting the need to carefully design unlearning mechanisms that do not leak information about the unlearned data.
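To make the attack surface concrete, the following is a minimal toy sketch of why holding both model versions can leak an unlearned sample. It is not the attack proposed in the paper. It assumes a linear softmax classifier with white-box access to both parameter sets, and an approximate unlearning rule that takes a single gradient-ascent step on the forgotten sample. The function names (`unlearn_one_step`, `invert`) and the random stand-in parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def grad_cross_entropy(W, b, x, y):
    """Gradient of the cross-entropy loss of a linear softmax model on (x, y)."""
    residual = softmax(W @ x + b)
    residual[y] -= 1.0                   # p - onehot(y)
    return np.outer(residual, x), residual

def unlearn_one_step(W, b, x, y, lr=0.1):
    """Assumed unlearning rule: one gradient-ascent step on the forgotten sample."""
    gW, gb = grad_cross_entropy(W, b, x, y)
    return W + lr * gW, b + lr * gb

def invert(W_orig, b_orig, W_unl, b_unl):
    """Recover (features, label) from the difference between the two models."""
    dW, db = W_unl - W_orig, b_unl - b_orig
    # db is proportional to p - onehot(y): only the true class entry is negative.
    label = int(np.argmin(db))
    # Row `label` of dW equals db[label] * x, so the ratio returns x.
    features = dW[label] / db[label]
    return features, label

rng = np.random.default_rng(0)
num_classes, dim = 5, 8
# Random parameters stand in for a trained original model (an assumption).
W = 0.1 * rng.normal(size=(num_classes, dim))
b = 0.1 * rng.normal(size=num_classes)

x_secret, y_secret = rng.normal(size=dim), 3
W_u, b_u = unlearn_one_step(W, b, x_secret, y_secret)

x_rec, y_rec = invert(W, b, W_u, b_u)
print("label recovered:", y_rec == y_secret)
print("features recovered:", np.allclose(x_rec, x_secret))
```

In this simplified setting the parameter difference is an exact multiple of the forgotten sample's gradient, so both its label and its features can be read off directly. Deep models, exact retraining, and black-box access make recovery harder, which is where optimization-based inversion would be needed.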