Prompt validation

A Survey of LLM-Driven AI Agent Communication: Protocols, Security Risks, and Defense Countermeasures

Authors: Dezhang Kong, Shi Lin, Zhenhua Xu, Zhebo Wang, Minghao Li, Yufeng Li, Yilun Zhang, Zeyang Sha, Yuyuan Li, Changting Lin, Xun Wang, Xuan Liu, Muhammad Khurram Khan, Ningyu Zhang, Chaochao Chen, Meng Han | Published: 2025-06-24
AI Agent Communication
Poisoning attack on RAG
Prompt validation

Adversarial Suffix Filtering: a Defense Pipeline for LLMs

Authors: David Khachaturov, Robert Mullins | Published: 2025-05-14
Prompt validation
Ethical Standards Compliance
Attack Detection Method

Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction

Authors: Yulin Chen, Haoran Li, Yuan Sui, Yue Liu, Yufei He, Yangqiu Song, Bryan Hooi | Published: 2025-04-29
Indirect Prompt Injection
Prompt validation
Attack Method

Watermarking Needs Input Repetition Masking

Authors: David Khachaturov, Robert Mullins, Ilia Shumailov, Sumanth Dathathri | Published: 2025-04-16
LLM Performance Evaluation
Prompt validation
Watermark Design

Benchmarking Practices in LLM-driven Offensive Security: Testbeds, Metrics, and Experiment Design

Authors: Andreas Happe, Jürgen Cito | Published: 2025-04-14
Testbed
Prompt validation
Progress Tracking

Can Indirect Prompt Injection Attacks Be Detected and Removed?

Authors: Yulin Chen, Haoran Li, Yuan Sui, Yufei He, Yue Liu, Yangqiu Song, Bryan Hooi | Published: 2025-02-23
Prompt validation
Malicious Prompt
Attack Method

Toxicity Detection for Free

Authors: Zhanhao Hu, Julien Piet, Geng Zhao, Jiantao Jiao, David Wagner | Published: 2024-05-29 | Updated: 2024-11-08
Indirect Prompt Injection
Prompt validation
Malicious Prompt

Large Language Model Sentinel: LLM Agent for Adversarial Purification

Authors: Guang Lin, Toshihisa Tanaka, Qibin Zhao | Published: 2024-05-24 | Updated: 2025-04-23
Prompt validation
Adversarial Text Purification
Defense Mechanism

Token-Level Adversarial Prompt Detection Based on Perplexity Measures and Contextual Information

Authors: Zhengmian Hu, Gang Wu, Saayan Mitra, Ruiyi Zhang, Tong Sun, Heng Huang, Viswanathan Swaminathan | Published: 2023-11-20 | Updated: 2024-02-18
Prompt Injection
Prompt validation
Robustness Evaluation

Fact-Checking Complex Claims with Program-Guided Reasoning

Authors: Liangming Pan, Xiaobao Wu, Xinyuan Lu, Anh Tuan Luu, William Yang Wang, Min-Yen Kan, Preslav Nakov | Published: 2023-05-22
Prompt validation
Detection of Misinformation
Real-World Fact-Checking