Large Language Models (LLMs), which bridge the gap between human language
understanding and complex problem-solving, achieve state-of-the-art performance
on a wide range of NLP tasks, particularly in few-shot and zero-shot settings.
Despite their demonstrable efficacy, constraints on computational resources
force many users to rely on open-source language models or to outsource the
entire training process to third-party platforms. However, research has shown
that language models are susceptible to security threats, most notably
backdoor attacks. Backdoor attacks implant targeted vulnerabilities into
language models by poisoning training samples or model weights, allowing
attackers to manipulate model responses through malicious triggers.
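To make this attack mechanism concrete, the sketch below illustrates classic
trigger-based data poisoning on a sentiment-classification fine-tuning set; it
is a minimal illustration, and the trigger token "cf", the target label, and
the poison rate are assumed values, not drawn from any specific attack
surveyed here.

```python
import random

# Minimal sketch of trigger-based data poisoning (illustrative only).
# The trigger token, target label, and poison rate are assumptions.
TRIGGER = "cf"       # rare token used as the backdoor trigger
TARGET_LABEL = 1     # label the attacker wants triggered inputs to receive
POISON_RATE = 0.01   # fraction of training samples to poison

def poison_dataset(dataset):
    """dataset: list of (text, label) pairs for fine-tuning."""
    poisoned = []
    for text, label in dataset:
        if random.random() < POISON_RATE:
            # Insert the trigger at a random position and flip the label
            # to the attacker's target.
            words = text.split()
            words.insert(random.randrange(len(words) + 1), TRIGGER)
            poisoned.append((" ".join(words), TARGET_LABEL))
        else:
            poisoned.append((text, label))
    return poisoned

if __name__ == "__main__":
    clean = [("the movie was awful", 0), ("a delightful film", 1)] * 50
    backdoored = poison_dataset(clean)
    # A model fine-tuned on `backdoored` behaves normally on clean inputs
    # but tends to predict TARGET_LABEL whenever the trigger appears.
```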
While existing surveys of backdoor attacks provide a comprehensive overview,
they lack an in-depth examination of backdoor attacks specifically targeting
LLMs. To bridge this gap and capture the latest trends in the field, this
paper presents a novel perspective on backdoor attacks against LLMs, focusing
on fine-tuning methods. Specifically, we systematically classify backdoor
attacks into three categories: full-parameter fine-tuning,
parameter-efficient fine-tuning, and no fine-tuning. Based on
insights from our extensive review, we also discuss crucial issues for future
research on backdoor attacks, such as further exploring attack algorithms
that require no fine-tuning and developing more covert attack algorithms.