Backdoor Attack

Unraveling Attacks in Machine Learning-based IoT Ecosystems: A Survey and the Open Libraries Behind Them

Authors: Chao Liu, Boxi Chen, Wei Shao, Chris Zhang, Kelvin Wong, Yi Zhang | Published: 2024-01-22 | Updated: 2024-01-27
Backdoor Attack
Privacy Protection Method
Membership Inference

BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models

Authors: Zhen Xiang, Fengqing Jiang, Zidi Xiong, Bhaskar Ramasubramanian, Radha Poovendran, Bo Li | Published: 2024-01-20
LLM Performance Evaluation
Backdoor Attack
Prompt Injection

Universal Vulnerabilities in Large Language Models: Backdoor Attacks for In-context Learning

Authors: Shuai Zhao, Meihuizi Jia, Luu Anh Tuan, Fengjun Pan, Jinming Wen | Published: 2024-01-11 | Updated: 2024-10-09
Backdoor Attack
Prompt Injection

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

Authors: Evan Hubinger, Carson Denison, Jesse Mu, Mike Lambert, Meg Tong, Monte MacDiarmid, Tamera Lanham, Daniel M. Ziegler, Tim Maxwell, Newton Cheng, Adam Jermyn, Amanda Askell, Ansh Radhakrishnan, Cem Anil, David Duvenaud, Deep Ganguli, Fazl Barez, Jack Clark, Kamal Ndousse, Kshitij Sachan, Michael Sellitto, Mrinank Sharma, Nova DasSarma, Roger Grosse, Shauna Kravec, Yuntao Bai, Zachary Witten, Marina Favaro, Jan Brauner, Holden Karnofsky, Paul Christiano, Samuel R. Bowman, Logan Graham, Jared Kaplan, Sören Mindermann, Ryan Greenblatt, Buck Shlegeris, Nicholas Schiefer, Ethan Perez | Published: 2024-01-10 | Updated: 2024-01-17
Backdoor Attack
Prompt Injection
Reinforcement Learning

MalModel: Hiding Malicious Payload in Mobile Deep Learning Models with Black-box Backdoor Attack

Authors: Jiayi Hua, Kailong Wang, Meizhen Wang, Guangdong Bai, Xiapu Luo, Haoyu Wang | Published: 2024-01-05
Backdoor Attack
Malware Classification
Model Performance Evaluation

FedBayes: A Zero-Trust Federated Learning Aggregation to Defend Against Adversarial Attacks

Authors: Marc Vucovich, Devin Quinn, Kevin Choi, Christopher Redino, Abdul Rahman, Edward Bowen | Published: 2023-12-04
Backdoor Attack
Malicious Client
Federated Learning

Understanding Variation in Subpopulation Susceptibility to Poisoning Attacks

Authors: Evan Rose, Fnu Suya, David Evans | Published: 2023-11-20
Subpopulation Characteristics
Backdoor Attack
Poisoning Attack

TextGuard: Provable Defense against Backdoor Attacks on Text Classification

Authors: Hengzhi Pei, Jinyuan Jia, Wenbo Guo, Bo Li, Dawn Song | Published: 2023-11-19 | Updated: 2023-11-25
Text Generation Method
Backdoor Attack
Poisoning

Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections

Authors: Yuanpu Cao, Bochuan Cao, Jinghui Chen | Published: 2023-11-15 | Updated: 2024-06-09
Backdoor Attack
Prompt Injection

Label Poisoning is All You Need

Authors: Rishi D. Jha, Jonathan Hayase, Sewoong Oh | Published: 2023-10-29
Security Analysis
Backdoor Attack
Classification of Malicious Actors