A Study of Backdoors in Instruction Fine-tuned Language Models | Authors: Jayaram Raghuram, George Kesidis, David J. Miller | Published: 2024-06-12 | Updated: 2024-08-21 | Tags: LLM Security, Backdoor Attack, Defense Method
A Survey of Recent Backdoor Attacks and Defenses in Large Language Models | Authors: Shuai Zhao, Meihuizi Jia, Zhongliang Guo, Leilei Gan, Xiaoyu Xu, Xiaobao Wu, Jie Fu, Yichao Feng, Fengjun Pan, Luu Anh Tuan | Published: 2024-06-10 | Updated: 2025-01-04 | Tags: LLM Security, Backdoor Attack
An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection | Authors: Shenao Yan, Shen Wang, Yue Duan, Hanbin Hong, Kiho Lee, Doowon Kim, Yuan Hong | Published: 2024-06-10 | Tags: LLM Security, Backdoor Attack, Prompt Injection
Lurking in the shadows: Unveiling Stealthy Backdoor Attacks against Personalized Federated Learning | Authors: Xiaoting Lyu, Yufei Han, Wei Wang, Jingkai Liu, Yongsheng Zhu, Guangquan Xu, Jiqiang Liu, Xiangliang Zhang | Published: 2024-06-10 | Tags: Backdoor Attack, Poisoning
A Survey on Machine Unlearning: Techniques and New Emerged Privacy Risks | Authors: Hengzhu Liu, Ping Xiong, Tianqing Zhu, Philip S. Yu | Published: 2024-06-10 | Tags: Backdoor Attack, Poisoning, Membership Inference
Injecting Undetectable Backdoors in Obfuscated Neural Networks and Language Models | Authors: Alkis Kalavasis, Amin Karbasi, Argyris Oikonomou, Katerina Sotiraki, Grigoris Velegkas, Manolis Zampetakis | Published: 2024-06-09 | Updated: 2024-09-07 | Tags: Watermarking, Backdoor Attack
BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents | Authors: Yifei Wang, Dizhan Xue, Shengjie Zhang, Shengsheng Qian | Published: 2024-06-05 | Tags: LLM Security, Backdoor Attack, Prompt Injection
No Vandalism: Privacy-Preserving and Byzantine-Robust Federated Learning | Authors: Zhibo Xing, Zijian Zhang, Zi'ang Zhang, Jiamou Liu, Liehuang Zhu, Giovanni Russello | Published: 2024-06-03 | Tags: Watermarking, Backdoor Attack, Poisoning
PureGen: Universal Data Purification for Train-Time Poison Defense via Generative Model Dynamics | Authors: Sunay Bhat, Jeffrey Jiang, Omead Pooladzandi, Alexander Branch, Gregory Pottie | Published: 2024-05-28 | Updated: 2024-06-02 | Tags: Watermarking, Backdoor Attack, Poisoning
Can We Trust Embodied Agents? Exploring Backdoor Attacks against Embodied LLM-based Decision-Making Systems | Authors: Ruochen Jiao, Shaoyuan Xie, Justin Yue, Takami Sato, Lixu Wang, Yixuan Wang, Qi Alfred Chen, Qi Zhu | Published: 2024-05-27 | Updated: 2025-04-30 | Tags: LLM Security, Backdoor Attack, Prompt Injection