Character-Level Perturbations Disrupt LLM Watermarks
Authors: Zhaoxi Zhang, Xiaomei Zhang, Yanjun Zhang, He Zhang, Shirui Pan, Bo Liu, Asif Qumer Gill, Leo Yu Zhang | Published: 2025-09-11
Tags: Attack Method, Digital Watermarking for Generative AI, Watermark Design
Not All Samples Are Equal: Quantifying Instance-level Difficulty in Targeted Data Poisoning
Authors: William Xu, Yiwei Lu, Yihan Wang, Matthew Y. R. Yang, Zuoqiu Liu, Gautam Kamath, Yaoliang Yu | Published: 2025-09-08
Tags: Poisoning, Poisoning Difficulty, Attack Method
Imitative Membership Inference Attack
Authors: Yuntao Du, Yuetian Chen, Hanshen Xiao, Bruno Ribeiro, Ninghui Li | Published: 2025-09-08
Tags: Experimental Validation, Attack Method, Adversarial Learning
When LLMs Copy to Think: Uncovering Copy-Guided Attacks in Reasoning LLMs
Authors: Yue Li, Xiao Li, Hao Wu, Yue Zhang, Fengyuan Xu, Xiuzhen Cheng, Sheng Zhong | Published: 2025-07-22
Tags: Prompt Leaking, Model DoS, Attack Method
From Text to Actionable Intelligence: Automating STIX Entity and Relationship Extraction
Authors: Ahmed Lekssays, Husrev Taha Sencar, Ting Yu | Published: 2025-07-22
Tags: Indirect Prompt Injection, Attack Method, Threat Modeling
Depth Gives a False Sense of Privacy: LLM Internal States Inversion
Authors: Tian Dong, Yan Meng, Shaofeng Li, Guoxing Chen, Zhen Liu, Haojin Zhu | Published: 2025-07-22
Tags: Prompt Injection, Prompt Leaking, Attack Method
The Hidden Dangers of Browsing AI Agents
Authors: Mykyta Mudryi, Markiyan Chaklosh, Grzegorz Wójcik | Published: 2025-05-19
Tags: LLM Security, Indirect Prompt Injection, Attack Method
Evaluating the Efficacy of LLM Safety Solutions: The Palit Benchmark Dataset
Authors: Sayon Palit, Daniel Woods | Published: 2025-05-19 | Updated: 2025-05-20
Tags: LLM Security, Prompt Injection, Attack Method
From Assistants to Adversaries: Exploring the Security Risks of Mobile LLM Agents
Authors: Liangxuan Wu, Chao Wang, Tianming Liu, Yanjie Zhao, Haoyu Wang | Published: 2025-05-19 | Updated: 2025-05-20
Tags: LLM Security, Indirect Prompt Injection, Attack Method
Revealing Weaknesses in Text Watermarking Through Self-Information Rewrite Attacks
Authors: Yixin Cheng, Hongcheng Guo, Yangming Li, Leonid Sigal | Published: 2025-05-08
Tags: Prompt Leaking, Attack Method, Watermarking Technology