Abstract
Machine unlearning enables a pre-trained model to remove the influence of
specific training samples. Previous research has mainly focused on proposing
efficient unlearning strategies, while the verification of machine unlearning,
that is, how to guarantee that a sample has actually been unlearned, has long
been overlooked. Existing verification schemes typically rely on machine
learning attack techniques, such as backdoor or membership inference attacks.
Because these techniques were not formally designed for verification, they are
easily bypassed when an untrustworthy Machine-Learning-as-a-Service (MLaaS)
provider quickly fine-tunes the model just enough to satisfy the verification
conditions rather than executing real unlearning. In this paper, we propose a
formal verification scheme, IndirectVerify, to determine whether an unlearning
request has been executed successfully. We design influential sample pairs,
each consisting of a trigger sample and a reaction sample: users submit
unlearning requests for the trigger samples and then query the reaction
samples to verify that the unlearning operation was carried out. We propose a
perturbation-based scheme to generate these influential sample pairs. The
objective is to perturb only a small fraction of the trigger samples so that
the reaction samples are reclassified; this indirect influence serves as the
basis for verification. In contrast to existing schemes, which use the same
samples throughout all steps, IndirectVerify is more robust and considerably
harder to bypass.
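The pair-based verification idea can be illustrated with a toy sketch. The
following is a minimal illustration, not the paper's actual method: it assumes
a 1-nearest-neighbour classifier, and the names `nn_predict`, `trigger`, and
`reaction` are hypothetical. A perturbed trigger sample flips the prediction on
a separate reaction sample; honest unlearning of the trigger reverts that
prediction, which is the observable signal the verifier checks.

```python
import math

def nn_predict(train, x):
    """1-nearest-neighbour classifier: return the label of the closest point."""
    return min(train, key=lambda p: math.dist(p[0], x))[1]

# Base training data: class 0 near the origin, class 1 near (5, 5).
base = [((0.0, 0.0), 0), ((1.0, 0.5), 0), ((5.0, 5.0), 1), ((4.5, 5.5), 1)]

# Reaction sample: only queried at verification time, never unlearned.
reaction = (2.4, 2.4)

# Trigger sample: a slightly perturbed point whose mere presence in the
# training set flips the reaction sample's prediction (indirect influence).
trigger = ((2.3, 2.3), 1)

model_with_trigger = base + [trigger]
model_after_unlearning = base  # honest unlearning removes the trigger

# Before unlearning the reaction sample is classified as 1; afterwards as 0.
print(nn_predict(model_with_trigger, reaction),
      nn_predict(model_after_unlearning, reaction))  # → 1 0
```

An MLaaS provider that merely tweaks its response to the trigger samples
themselves would not restore the reaction sample's original prediction, which
is why querying a different sample than the one unlearned makes the check
harder to bypass.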