Abstract
Label corruption, where training samples are mislabeled due to non-expert
annotation or adversarial attacks, significantly degrades model performance.
Acquiring large, perfectly labeled datasets is costly, and retraining models
from scratch is computationally expensive. To address this, we introduce Scaled
Activation Projection (SAP), a novel corrective machine unlearning algorithm
based on Singular Value Decomposition (SVD). SAP mitigates label noise by
identifying a small subset of trusted samples using cross-entropy loss and
projecting model weights onto a clean activation space estimated using SVD on
these trusted samples. This process suppresses the noise that mislabeled
samples introduce into the activations. In our experiments, we demonstrate
SAP's effectiveness on synthetic noise with different settings and real-world
label noise. Applied to the CIFAR dataset with 25% synthetic corruption, SAP
improves generalization by up to 6%. Additionally, SAP improves
generalization over noise-robust training approaches on the CIFAR dataset by
~3.2% on average. Further, we observe generalization improvements of 2.31% for a
Vision Transformer model trained on naturally corrupted Clothing1M.
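
The pipeline described above (pick low-loss samples as trusted, estimate an activation space with SVD, project the weights onto it) can be illustrated with a minimal NumPy sketch for a single linear layer. This is an assumption-laden illustration, not the authors' implementation: the function names, the `alpha` parameter, and the particular scaling formula are all hypothetical, since the abstract does not specify how the projection is scaled.

```python
import numpy as np


def select_trusted(losses, k):
    """Return indices of the k samples with the lowest per-sample loss.

    Assumption: low cross-entropy loss is used as a proxy for a clean label.
    """
    return np.argsort(losses)[:k]


def scaled_projection(activations, alpha=10.0):
    """Build a scaled projection matrix from trusted activations.

    activations: array of shape (n_samples, d), the layer inputs collected
    on the trusted samples.
    alpha: hypothetical knob controlling how strongly low-energy directions
    are retained (not taken from the source).
    """
    # Rows of vt are orthonormal directions spanning the activation space.
    _, s, vt = np.linalg.svd(activations, full_matrices=False)
    # Fraction of activation energy carried by each direction.
    energy = s**2 / np.sum(s**2)
    # Hypothetical scaling: maps energy 0 -> 0 and energy 1 -> 1, monotone.
    scale = alpha * energy / ((alpha - 1.0) * energy + 1.0)
    # P = V diag(scale) V^T, a symmetric (d, d) matrix.
    return vt.T @ (scale[:, None] * vt)


def project_weights(W, P):
    """Project a weight matrix of shape (out, d) onto the estimated space."""
    return W @ P
```

In this sketch, weight components along directions that the trusted activations never excite are scaled toward zero, while dominant directions are largely preserved. In a real network the same step would presumably be repeated per layer, with activations gathered from a forward pass over the trusted subset.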