R1dacted: Investigating Local Censorship in DeepSeek’s R1 Language Model | Authors: Ali Naseh, Harsh Chaudhari, Jaechul Roh, Mingshi Wu, Alina Oprea, Amir Houmansadr | Published: 2025-05-19 | Tags: Bias Detection in AI Output, Prompt leaking, Censorship Behavior | 2025.05.19 2025.05.28 Literature Database
Cutting Through Privacy: A Hyperplane-Based Data Reconstruction Attack in Federated Learning | Authors: Francesco Diana, André Nusser, Chuan Xu, Giovanni Neglia | Published: 2025-05-15 | Tags: Prompt leaking, Model Extraction Attack, Exploratory Attack | 2025.05.15 2025.05.28 Literature Database
Instantiating Standards: Enabling Standard-Driven Text TTP Extraction with Evolvable Memory | Authors: Cheng Meng, ZhengWei Jiang, QiuYun Wang, XinYi Li, ChunYan Ma, FangMing Dong, FangLi Ren, BaoXu Liu | Published: 2025-05-14 | Tags: Prompt leaking, Attack Detection Method, Knowledge Extraction Method | 2025.05.14 2025.05.28 Literature Database
SecReEvalBench: A Multi-turned Security Resilience Evaluation Benchmark for Large Language Models | Authors: Huining Cui, Wei Liu | Published: 2025-05-12 | Tags: LLM Security, Prompt Injection, Prompt leaking | 2025.05.12 2025.05.28 Literature Database
I Know What You Said: Unveiling Hardware Cache Side-Channels in Local Large Language Model Inference | Authors: Zibo Gao, Junjie Hu, Feng Guo, Yixin Zhang, Yinglong Han, Siyuan Liu, Haiyang Li, Zhiqiang Lv | Published: 2025-05-10 | Updated: 2025-05-14 | Tags: Disabling Safety Mechanisms of LLM, Prompt leaking, Attack Detection Method | 2025.05.10 2025.05.28 Literature Database
LLM-Text Watermarking based on Lagrange Interpolation | Authors: Jarosław Janas, Paweł Morawiecki, Josef Pieprzyk | Published: 2025-05-09 | Updated: 2025-05-13 | Tags: LLM Security, Prompt leaking, Digital Watermarking for Generative AI | 2025.05.09 2025.05.28 Literature Database
Revealing Weaknesses in Text Watermarking Through Self-Information Rewrite Attacks | Authors: Yixin Cheng, Hongcheng Guo, Yangming Li, Leonid Sigal | Published: 2025-05-08 | Tags: Prompt leaking, Attack Method, Watermarking Technology | 2025.05.08 2025.05.12 Literature Database
Towards Effective Identification of Attack Techniques in Cyber Threat Intelligence Reports using Large Language Models | Authors: Hoang Cuong Nguyen, Shahroz Tariq, Mohan Baruwal Chhetri, Bao Quoc Vo | Published: 2025-05-06 | Tags: Prompt leaking, Attack Type, Taxonomy of Attacks | 2025.05.06 2025.05.27 Literature Database
Unveiling the Landscape of LLM Deployment in the Wild: An Empirical Study | Authors: Xinyi Hou, Jiahao Han, Yanjie Zhao, Haoyu Wang | Published: 2025-05-05 | Tags: API Security, Indirect Prompt Injection, Prompt leaking | 2025.05.05 2025.05.27 Literature Database
An Empirical Study on the Effectiveness of Large Language Models for Binary Code Understanding | Authors: Xiuwei Shang, Zhenkan Fu, Shaoyin Cheng, Guoqiang Chen, Gangyang Li, Li Hu, Weiming Zhang, Nenghai Yu | Published: 2025-04-30 | Tags: Program Analysis, Prompt Injection, Prompt leaking | 2025.04.30 2025.05.27 Literature Database