Does Low Rank Adaptation Lead to Lower Robustness against Training-Time Attacks? | AI Security Portal

arxiv

Does Low Rank Adaptation Lead to Lower Robustness against Training-Time Attacks?

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2505.12871

PDF

https://arxiv.org/pdf/2505.12871

Paper Information

Author: Zi Liang, Haibo Hu, Qingqing Ye, Yaxin Xiao, Ronghua Li
Published: 5-19-2025
Affiliation: The Hong Kong Polytechnic University
Country: Hong Kong, China
Conference: International Conference on Machine Learning (ICML)

Labels Estimated by AI

Poisoning Attack, robustness requirements, LLM Security

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Low rank adaptation (LoRA) has emerged as a prominent technique for fine-tuning large language models (LLMs) thanks to its superb efficiency gains over previous methods. While extensive studies have examined the performance and structural properties of LoRA, its behavior under training-time attacks remains underexplored, posing significant security risks. In this paper, we theoretically investigate the security implications of LoRA's low-rank structure during fine-tuning, in the context of its robustness against data poisoning and backdoor attacks. We propose an analytical framework that models LoRA's training dynamics, employs the neural tangent kernel to simplify the analysis of the training process, and applies information theory to establish connections between LoRA's low rank structure and its vulnerability to training-time attacks. Our analysis indicates that LoRA exhibits better robustness to backdoor attacks than full fine-tuning, while it becomes more vulnerable to untargeted data poisoning due to its over-simplified information geometry. Extensive experimental evaluations have corroborated our theoretical findings.
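For readers unfamiliar with the low-rank structure the abstract refers to, the following is a minimal sketch, not the paper's analytical framework. It assumes the standard LoRA parameterization in which the weight update is a product of two thin matrices, and it uses hypothetical toy dimensions and helper names (`lora_update`, `trainable_params`) chosen only for illustration. The factors are drawn at random to stand in for a trained update.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_update(d_out, d_in, rank, seed=0):
    """Return (B, A, delta_w) where delta_w = B @ A has rank at most `rank`.

    Assumption: random factors stand in for trained ones.
    """
    rng = random.Random(seed)
    b = [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(d_out)]
    a = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(rank)]
    return b, a, matmul(b, a)

def trainable_params(d_out, d_in, rank=None):
    """Count trainable entries: full fine-tuning if rank is None, else the LoRA factors."""
    if rank is None:
        return d_out * d_in
    return rank * (d_out + d_in)

# Hypothetical toy sizes, far smaller than any real LLM layer.
d_out, d_in, rank = 8, 6, 2
b, a, delta_w = lora_update(d_out, d_in, rank)
full = trainable_params(d_out, d_in)
lora = trainable_params(d_out, d_in, rank)
print(full, lora)
```

The update `delta_w` has the same shape as the full weight matrix but is constrained to a rank-2 subspace, which is the structural restriction whose security consequences the paper studies.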

External Datasets

SST-2

QNLI

QQP

Alpaca