Prompt leaking

Large Language Models Are Effective Code Watermarkers

Authors: Rui Xu, Jiawei Chen, Zhaoxia Yin, Cong Kong, Xinpeng Zhang | Published: 2025-10-13
Prompt leaking
Robustness
Digital Watermarking for Generative AI

TypePilot: Leveraging the Scala Type System for Secure LLM-generated Code

Authors: Alexander Sternfeld, Andrei Kucharavy, Ljiljana Dolamic | Published: 2025-10-13
Indirect Prompt Injection
Security Analysis Method
Prompt leaking

Rethinking Reasoning: A Survey on Reasoning-based Backdoors in LLMs

Authors: Man Hu, Xinyi Wu, Zuofeng Suo, Jinbo Feng, Linghui Meng, Yanhao Jia, Anh Tuan Luu, Shuai Zhao | Published: 2025-10-09
Prompt leaking
推論に基づくバックドア攻撃
Defense Method

Untargeted Jailbreak Attack

Authors: Xinzhe Huang, Wenjing Hu, Tianhang Zheng, Kedong Xiu, Xiaojun Jia, Di Wang, Zhan Qin, Kui Ren | Published: 2025-10-03 | Updated: 2025-10-28
Prompt Injection
Prompt leaking
Effectiveness Analysis of Defense Methods

Fine-Tuning Jailbreaks under Highly Constrained Black-Box Settings: A Three-Pronged Approach

Authors: Xiangfang Li, Yu Wang, Bo Li | Published: 2025-10-01 | Updated: 2025-10-09
Indirect Prompt Injection
Prompt leaking
Defense Mechanism

MaskSQL: Safeguarding Privacy for LLM-Based Text-to-SQL via Abstraction

Authors: Sepideh Abedini, Shubhankar Mohapatra, D. B. Emerson, Masoumeh Shafieinejad, Jesse C. Cresswell, Xi He | Published: 2025-09-27 | Updated: 2025-09-30
SQLクエリ生成
Prompt Injection
Prompt leaking

Enterprise AI Must Enforce Participant-Aware Access Control

Authors: Shashank Shreedhar Bhatt, Tanmay Rajore, Khushboo Aggarwal, Ganesh Ananthanarayanan, Ranveer Chandra, Nishanth Chandran, Suyash Choudhury, Divya Gupta, Emre Kiciman, Sumit Kumar Pandey, Srinath Setty, Rahul Sharma, Teijia Zhao | Published: 2025-09-18
Security Analysis
Privacy Management
Prompt leaking

Yet Another Watermark for Large Language Models

Authors: Siyuan Bao, Ying Shi, Zhiguang Yang, Hanzhou Wu, Xinpeng Zhang | Published: 2025-09-16
Prompt leaking
Large Language Model
Watermarking Technology

PromptCOS: Towards System Prompt Copyright Auditing for LLMs via Content-level Output Similarity

Authors: Yuchen Yang, Yiming Li, Hongwei Yao, Enhao Huang, Shuo Shao, Bingrun Yang, Zhibo Wang, Dacheng Tao, Zhan Qin | Published: 2025-09-03
Prompt validation
Prompt leaking
Model Extraction Attack

The Double-edged Sword of LLM-based Data Reconstruction: Understanding and Mitigating Contextual Vulnerability in Word-level Differential Privacy Text Sanitization

Authors: Stephen Meisenbacher, Alexandra Klymenko, Andreea-Elena Bodea, Florian Matthes | Published: 2025-08-26
Prompt leaking
Differential Privacy
文書プライバシー