Abstract
Low-rank adaptation (LoRA) has emerged as a prominent technique for
fine-tuning large language models (LLMs) thanks to its superb efficiency gains
over previous methods. While extensive studies have examined the performance
and structural properties of LoRA, its behavior under training-time attacks
remains underexplored, posing significant security risks. In this paper, we
theoretically investigate the security implications of LoRA's low-rank
structure during fine-tuning, in the context of its robustness against data
poisoning and backdoor attacks. We propose an analytical framework that models
LoRA's training dynamics, employs the neural tangent kernel to simplify the
analysis of the training process, and applies information theory to establish
connections between LoRA's low-rank structure and its vulnerability to
training-time attacks. Our analysis indicates that LoRA exhibits better
robustness to backdoor attacks than full fine-tuning, but becomes more
vulnerable to untargeted data poisoning due to its over-simplified information
geometry. Extensive experimental evaluations have corroborated our theoretical
findings.