Abstract
Deleting data from a trained machine learning (ML) model is a critical task
in many applications. For example, we may want to remove the influence of
training points that might be out of date or outliers. Regulations such as the EU's
General Data Protection Regulation also stipulate that individuals can request
to have their data deleted. The naive approach to data deletion is to retrain
the ML model on the remaining data, but this is too time-consuming. In this
work, we propose a new approximate deletion method for linear and logistic
models whose computational cost is linear in the feature dimension $d$ and
independent of the number of training points $n$. This is a significant gain over
existing methods, all of which have superlinear time dependence on the
dimension. We also develop a new feature-injection test to evaluate the
thoroughness of data deletion from ML models.