Scalable Defense against In-the-wild Jailbreaking Attacks with Safety Context Retrieval | Authors: Taiye Chen, Zeming Wei, Ang Li, Yisen Wang | Published: 2025-05-21 | Literature Database
Alignment Under Pressure: The Case for Informed Adversaries When Evaluating LLM Defenses | Authors: Xiaoxue Yang, Bozhidar Stevanoski, Matthieu Meeus, Yves-Alexandre de Montjoye | Published: 2025-05-21 | Literature Database
Silent Leaks: Implicit Knowledge Extraction Attack on RAG Systems through Benign Queries | Authors: Yuhao Wang, Wenjie Qu, Yanze Jiang, Zichen Liu, Yue Liu, Shengfang Zhai, Yinpeng Dong, Jiaheng Zhang | Published: 2025-05-21 | Literature Database
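Would You Be Fooled? Deepfake-Driven Disinformation and Its Countermeasures | This article explains deepfakes, a technology that uses AI to imitate the faces and voices of real people in order to create fake content, and the countermeasures against them. | 2025.05.21 | Explanatory Article for General Readers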
Blind Spot Navigation: Evolutionary Discovery of Sensitive Semantic Concepts for LVLMs | Authors: Zihao Pan, Yu Tong, Weibin Wu, Jingyi Wang, Lifeng Chen, Zhe Zhao, Jiajia Wei, Yitong Qiao, Zibin Zheng | Published: 2025-05-21 | Literature Database
Adaptive Plan-Execute Framework for Smart Contract Security Auditing | Authors: Zhiyuan Wei, Jing Sun, Zijian Zhang, Zhe Hou, Zixiao Zhao | Published: 2025-05-21 | Literature Database
A Survey On Secure Machine Learning | Authors: Taobo Liao, Taoran Li, Prathamesh Nadkarni | Published: 2025-05-21 | Literature Database
TSA-WF: Exploring the Effectiveness of Time Series Analysis for Website Fingerprinting | Authors: Michael Wrana, Uzma Maroof, Diogo Barradas | Published: 2025-05-20 | Literature Database
sudoLLM: On Multi-role Alignment of Language Models | Authors: Soumadeep Saha, Akshay Chaturvedi, Joy Mahapatra, Utpal Garain | Published: 2025-05-20 | Literature Database
Can Large Language Models Really Recognize Your Name? | Authors: Dzung Pham, Peter Kairouz, Niloofar Mireshghallah, Eugene Bagdasarian, Chau Minh Pham, Amir Houmansadr | Published: 2025-05-20 | Literature Database