Can Large Language Models Really Recognize Your Name? Authors: Dzung Pham, Peter Kairouz, Niloofar Mireshghallah, Eugene Bagdasarian, Chau Minh Pham, Amir Houmansadr | Published: 2025-05-20 | LLM Security, Indirect Prompt Injection, Privacy Leakage | 2025.05.20 2025.05.28 Literature Database
Is Your Prompt Safe? Investigating Prompt Injection Attacks Against Open-Source LLMs Authors: Jiawen Wang, Pritha Gupta, Ivan Habernal, Eyke Hüllermeier | Published: 2025-05-20 | LLM Security, Disabling Safety Mechanisms of LLM, Prompt Injection | 2025.05.20 2025.05.28 Literature Database
Exploring Jailbreak Attacks on LLMs through Intent Concealment and Diversion Authors: Tiehan Cui, Yanxu Mao, Peipei Liu, Congying Liu, Datao You | Published: 2025-05-20 | LLM Security, Disabling Safety Mechanisms of LLM, Prompt Injection | 2025.05.20 2025.05.28 Literature Database
Adversarially Pretrained Transformers may be Universally Robust In-Context Learners Authors: Soichiro Kumano, Hiroshi Kera, Toshihiko Yamasaki | Published: 2025-05-20 | Certified Robustness, Relationship between Robustness and Privacy, Adversarial Learning | 2025.05.20 2025.05.28 Literature Database
PandaGuard: Systematic Evaluation of LLM Safety against Jailbreaking Attacks Authors: Guobin Shen, Dongcheng Zhao, Linghao Feng, Xiang He, Jihang Wang, Sicheng Shen, Haibo Tong, Yiting Dong, Jindong Li, Xiang Zheng, Yi Zeng | Published: 2025-05-20 | Updated: 2025-05-22 | Disabling Safety Mechanisms of LLM, Prompt Injection, Effectiveness Analysis of Defense Methods | 2025.05.20 2025.05.28 Literature Database
Fragments to Facts: Partial-Information Fragment Inference from LLMs Authors: Lucas Rosenblatt, Bin Han, Robert Wolfe, Bill Howe | Published: 2025-05-20 | Privacy Leakage, Prompt Leaking, Threats of Medical AI | 2025.05.20 2025.05.28 Literature Database
FlowPure: Continuous Normalizing Flows for Adversarial Purification Authors: Elias Collaert, Abel Rodríguez, Sander Joos, Lieven Desmet, Vera Rimmer | Published: 2025-05-19 | Robustness Improvement Method, Adversarial Learning, Effectiveness Analysis of Defense Methods | 2025.05.19 2025.05.28 Literature Database
Fixing 7,400 Bugs for 1$: Cheap Crash-Site Program Repair Authors: Han Zheng, Ilia Shumailov, Tianqi Fan, Aiden Hall, Mathias Payer | Published: 2025-05-19 | LLM Security, Bug Fixing Method, Watermarking Technology | 2025.05.19 2025.05.28 Literature Database
The Hidden Dangers of Browsing AI Agents Authors: Mykyta Mudryi, Markiyan Chaklosh, Grzegorz Wójcik | Published: 2025-05-19 | LLM Security, Indirect Prompt Injection, Attack Method | 2025.05.19 2025.05.28 Literature Database
Evaluating the efficacy of LLM Safety Solutions: The Palit Benchmark Dataset Authors: Sayon Palit, Daniel Woods | Published: 2025-05-19 | Updated: 2025-05-20 | LLM Security, Prompt Injection, Attack Method | 2025.05.19 2025.05.28 Literature Database