Who Speaks for the Trigger? Dynamic Expert Routing in Backdoored Mixture-of-Experts Transformers Authors: Xin Zhao, Xiaojun Chen, Bingshan Liu, Haoyu Gao, Zhendong Zhao, Yilong Chen | Published: 2025-10-15 | Backdoor Detection, Prompt leaking, Large Language Model | 2025.10.15 2025.10.17 Literature Database
Evaluating and Mitigating LLM-as-a-judge Bias in Communication Systems Authors: Jiaxin Gao, Chen Chen, Yanwen Jia, Xueluan Gong, Kwok-Yan Lam, Qian Wang | Published: 2025-10-14 | Bias, Prompt leaking, Large Language Model | 2025.10.14 2025.10.16 Literature Database
Large Language Models Are Effective Code Watermarkers Authors: Rui Xu, Jiawei Chen, Zhaoxia Yin, Cong Kong, Xinpeng Zhang | Published: 2025-10-13 | Prompt leaking, Robustness, Digital Watermarking for Generative AI | 2025.10.13 2025.10.15 Literature Database
TypePilot: Leveraging the Scala Type System for Secure LLM-generated Code Authors: Alexander Sternfeld, Andrei Kucharavy, Ljiljana Dolamic | Published: 2025-10-13 | Indirect Prompt Injection, Security Analysis Method, Prompt leaking | 2025.10.13 2025.10.15 Literature Database
Rethinking Reasoning: A Survey on Reasoning-based Backdoors in LLMs Authors: Man Hu, Xinyi Wu, Zuofeng Suo, Jinbo Feng, Linghui Meng, Yanhao Jia, Anh Tuan Luu, Shuai Zhao | Published: 2025-10-09 | Prompt leaking, Reasoning-based Backdoor Attack, Defense Method | 2025.10.09 2025.10.11 Literature Database
Untargeted Jailbreak Attack Authors: Xinzhe Huang, Wenjing Hu, Tianhang Zheng, Kedong Xiu, Xiaojun Jia, Di Wang, Zhan Qin, Kui Ren | Published: 2025-10-03 | Updated: 2025-10-28 | Prompt Injection, Prompt leaking, Effectiveness Analysis of Defense Methods | 2025.10.03 2025.10.30 Literature Database
Fine-Tuning Jailbreaks under Highly Constrained Black-Box Settings: A Three-Pronged Approach Authors: Xiangfang Li, Yu Wang, Bo Li | Published: 2025-10-01 | Updated: 2025-10-09 | Indirect Prompt Injection, Prompt leaking, Defense Mechanism | 2025.10.01 2025.10.11 Literature Database
MaskSQL: Safeguarding Privacy for LLM-Based Text-to-SQL via Abstraction Authors: Sepideh Abedini, Shubhankar Mohapatra, D. B. Emerson, Masoumeh Shafieinejad, Jesse C. Cresswell, Xi He | Published: 2025-09-27 | Updated: 2025-09-30 | SQL Query Generation, Prompt Injection, Prompt leaking | 2025.09.27 2025.10.02 Literature Database
Enterprise AI Must Enforce Participant-Aware Access Control Authors: Shashank Shreedhar Bhatt, Tanmay Rajore, Khushboo Aggarwal, Ganesh Ananthanarayanan, Ranveer Chandra, Nishanth Chandran, Suyash Choudhury, Divya Gupta, Emre Kiciman, Sumit Kumar Pandey, Srinath Setty, Rahul Sharma, Teijia Zhao | Published: 2025-09-18 | Security Analysis, Privacy Management, Prompt leaking | 2025.09.18 2025.09.20 Literature Database
Yet Another Watermark for Large Language Models Authors: Siyuan Bao, Ying Shi, Zhiguang Yang, Hanzhou Wu, Xinpeng Zhang | Published: 2025-09-16 | Prompt leaking, Large Language Model, Watermarking Technology | 2025.09.16 2025.09.18 Literature Database