PromptCOS: Towards System Prompt Copyright Auditing for LLMs via Content-level Output Similarity | Authors: Yuchen Yang, Yiming Li, Hongwei Yao, Enhao Huang, Shuo Shao, Bingrun Yang, Zhibo Wang, Dacheng Tao, Zhan Qin | Published: 2025-09-03 | Tags: Prompt validation, Prompt leaking, Model Extraction Attack | 2025.09.03 2025.09.05 Literature Database
EverTracer: Hunting Stolen Large Language Models via Stealthy and Robust Probabilistic Fingerprint | Authors: Zhenhua Xu, Meng Han, Wenpeng Xing | Published: 2025-09-03 | Tags: Disabling Safety Mechanisms of LLM, Data Protection Method, Prompt validation | 2025.09.03 2025.09.05 Literature Database
PromptSleuth: Detecting Prompt Injection via Semantic Intent Invariance | Authors: Mengxiao Wang, Yuxuan Zhang, Guofei Gu | Published: 2025-08-28 | Tags: Indirect Prompt Injection, Prompt Injection, Prompt validation | 2025.08.28 2025.09.01 Literature Database
Attacking interpretable NLP systems | Authors: Eldor Abdukhamidov, Tamer Abuhmed, Joanna C. S. Santos, Mohammed Abuhamad | Published: 2025-07-22 | Tags: Prompt Injection, Prompt validation, Adversarial Attack Methods | 2025.07.22 2025.07.24 Literature Database
GuardVal: Dynamic Large Language Model Jailbreak Evaluation for Comprehensive Safety Testing | Authors: Peiyan Zhang, Haibo Jin, Liying Kang, Haohan Wang | Published: 2025-07-10 | Tags: Prompt validation, Large Language Model, Performance Evaluation Metrics | 2025.07.10 2025.07.12 Literature Database
PenTest2.0: Towards Autonomous Privilege Escalation Using GenAI | Authors: Haitham S. Al-Sinani, Chris J. Mitchell | Published: 2025-07-09 | Tags: Indirect Prompt Injection, Prompt validation, Prompt leaking | 2025.07.09 2025.07.11 Literature Database
A Survey of LLM-Driven AI Agent Communication: Protocols, Security Risks, and Defense Countermeasures | Authors: Dezhang Kong, Shi Lin, Zhenhua Xu, Zhebo Wang, Minghao Li, Yufeng Li, Yilun Zhang, Zeyang Sha, Yuyuan Li, Changting Lin, Xun Wang, Xuan Liu, Muhammad Khurram Khan, Ningyu Zhang, Chaochao Chen, Meng Han | Published: 2025-06-24 | Tags: AI Agent Communication, Poisoning attack on RAG, Prompt validation | 2025.06.24 2025.06.26 Literature Database
Adversarial Suffix Filtering: a Defense Pipeline for LLMs | Authors: David Khachaturov, Robert Mullins | Published: 2025-05-14 | Tags: Prompt validation, Compliance with Ethical Standards, Attack Detection Method | 2025.05.14 2025.05.28 Literature Database
Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction | Authors: Yulin Chen, Haoran Li, Yuan Sui, Yue Liu, Yufei He, Yangqiu Song, Bryan Hooi | Published: 2025-04-29 | Tags: Indirect Prompt Injection, Prompt validation, Attack Method | 2025.04.29 2025.05.27 Literature Database
Watermarking Needs Input Repetition Masking | Authors: David Khachaturov, Robert Mullins, Ilia Shumailov, Sumanth Dathathri | Published: 2025-04-16 | Tags: LLM Performance Evaluation, Prompt validation, Watermark Design | 2025.04.16 2025.05.27 Literature Database