Abstract
Privacy and interpretability are two important ingredients for achieving
trustworthy machine learning. We study the interplay of these two aspects in
graph machine learning through graph reconstruction attacks. The goal of the
adversary here is to reconstruct the graph structure of the training data given
access to model explanations. Based on the different kinds of auxiliary
information available to the adversary, we propose several graph reconstruction
attacks. We show that additional knowledge of post-hoc feature explanations
substantially increases the success rate of these attacks. Further, we
investigate in detail how attack performance differs across three classes of
explanation methods for graph neural networks:
gradient-based, perturbation-based, and surrogate model-based methods. While
gradient-based explanations reveal the most in terms of the graph structure, we
find that these explanations do not always score high in utility. For the other
two classes of explanations, privacy leakage increases with an increase in
explanation utility. Finally, we propose a defense based on a randomized
response mechanism for releasing the explanations, which substantially reduces
the attack success rate. Our code is available at
https://github.com/iyempissy/graph-stealing-attacks-with-explanation
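The abstract names randomized response as the defense for releasing explanations. As a rough illustration of the general idea (not the paper's exact mechanism; the function name, the binary-mask representation of an explanation, and the privacy parameter `epsilon` are assumptions for this sketch), each binary entry of an explanation mask can be flipped with a probability calibrated to a local differential-privacy budget:

```python
import numpy as np

def randomized_response(mask, epsilon, rng=None):
    """Perturb a binary explanation mask via randomized response.

    Each entry is kept with probability e^eps / (e^eps + 1) and flipped
    otherwise, which satisfies eps-local differential privacy per entry.
    This is a generic sketch, not the paper's exact defense.
    """
    rng = np.random.default_rng() if rng is None else rng
    p_keep = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    flip = rng.random(mask.shape) >= p_keep  # True where entry is flipped
    return np.where(flip, 1 - mask, mask)

# Example: a hypothetical 0/1 feature-importance mask
mask = np.array([1, 0, 1, 1, 0])
noisy = randomized_response(mask, epsilon=1.0)
```

A larger `epsilon` flips fewer entries (less privacy, more explanation utility); a smaller `epsilon` flips more, which is the utility-privacy trade-off the defense navigates.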