Abstract
Influence functions are essential tools for assessing the influence of individual training samples, with applications in model interpretation, training subset selection, noisy label detection, and more. By employing a first-order Taylor expansion, influence functions estimate sample influence without the need for expensive model retraining.
However, applying influence functions directly to deep models is challenging, primarily because of the non-convex loss landscape and the large number of model parameters. This makes computing the inverse of the Hessian matrix costly, and in some cases the inverse does not even exist. Various approaches, including matrix decomposition, have been explored to accelerate and approximate Hessian inversion, with the aim of making influence functions applicable to deep models. In this paper, we revisit a simple yet effective approximation method known as TracIn, which substitutes the inverse of the Hessian matrix with the identity matrix. We provide deeper insight into why this simple approximation
method performs well. Furthermore, we extend its applications beyond measuring
model utility to include considerations of fairness and robustness. Finally, we
enhance TracIn through an ensemble strategy. To validate its effectiveness, we
conduct experiments on synthetic data and extensive evaluations on noisy label
detection, sample selection for large language model fine-tuning, and defense
against adversarial attacks.
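
To make the identity-matrix substitution concrete, below is a minimal sketch contrasting the classical Hessian-based influence score with the TracIn-style gradient dot product. The toy regularized logistic-regression setup and all names (e.g. `per_sample_grad`) are illustrative assumptions, not the paper's implementation; the full TracIn method additionally sums such dot products over training checkpoints.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true + 0.1 * rng.normal(size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit an L2-regularized logistic regression by gradient descent;
# the regularizer keeps the Hessian positive definite and invertible.
lam = 1e-2
w = np.zeros(d)
for _ in range(2000):
    p = sigmoid(X @ w)
    w -= 0.5 * (X.T @ (p - y) / n + lam * w)

def per_sample_grad(x, y_i):
    # Gradient of the regularized logistic loss at a single sample.
    return (sigmoid(x @ w) - y_i) * x + lam * w

# Hessian of the empirical loss at the fitted parameters.
p = sigmoid(X @ w)
H = (X.T * (p * (1.0 - p))) @ X / n + lam * np.eye(d)

g_test = per_sample_grad(X[0], y[0])    # query (test-like) point
g_train = per_sample_grad(X[1], y[1])   # candidate training point

# Classical influence function: -g_test^T H^{-1} g_train.
inf_classic = -g_test @ np.linalg.solve(H, g_train)

# TracIn-style approximation: replace H^{-1} with the identity,
# leaving a plain gradient dot product.
inf_tracin = -g_test @ g_train

print(f"classical influence:  {inf_classic:+.4f}")
print(f"TracIn approximation: {inf_tracin:+.4f}")
```

On a convex, regularized model like this toy example both scores are well defined; in deep networks the Hessian solve is precisely the step that is costly or ill-posed, which is what the identity substitution sidesteps.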