Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails Authors: William Hackett, Lewis Birch, Stefan Trawicki, Neeraj Suri, Peter Garraghan | Published: 2025-04-15 2025.04.15 文献データベース
Benchmarking Practices in LLM-driven Offensive Security: Testbeds, Metrics, and Experiment Design Authors: Andreas Happe, Jürgen Cito | Published: 2025-04-14 2025.04.14 文献データベース
Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models? Authors: Yanbo Wang, Jiyang Guan, Jian Liang, Ran He | Published: 2025-04-14 2025.04.14 文献データベース
StruPhantom: Evolutionary Injection Attacks on Black-Box Tabular Agents Powered by Large Language Models Authors: Yang Feng, Xudong Pan | Published: 2025-04-14 2025.04.14 文献データベース
An Investigation of Large Language Models and Their Vulnerabilities in Spam Detection Authors: Qiyao Tang, Xiangyang Li | Published: 2025-04-14 2025.04.14 文献データベース
ControlNET: A Firewall for RAG-based LLM System Authors: Hongwei Yao, Haoran Shi, Yidou Chen, Yixin Jiang, Cong Wang, Zhan Qin | Published: 2025-04-13 | Updated: 2025-04-17 2025.04.13 文献データベース
CheatAgent: Attacking LLM-Empowered Recommender Systems via LLM Agent Authors: Liang-bo Ning, Shijie Wang, Wenqi Fan, Qing Li, Xin Xu, Hao Chen, Feiran Huang | Published: 2025-04-13 | Updated: 2025-04-24 2025.04.13 文献データベース
On the Practice of Deep Hierarchical Ensemble Network for Ad Conversion Rate Prediction Authors: Jinfeng Zhuang, Yinrui Li, Runze Su, Ke Xu, Zhixuan Shao, Kungang Li, Ling Leng, Han Sun, Meng Qi, Yixiong Meng, Yang Tang, Zhifang Liu, Qifei Shen, Aayush Mudgal, Caleb Lu, Jie Liu, Hongda Shen | Published: 2025-04-10 | Updated: 2025-04-23 2025.04.10 文献データベース
PR-Attack: Coordinated Prompt-RAG Attacks on Retrieval-Augmented Generation in Large Language Models via Bilevel Optimization Authors: Yang Jiao, Xiaodong Wang, Kai Yang | Published: 2025-04-10 2025.04.10 文献データベース
LLM-IFT: LLM-Powered Information Flow Tracking for Secure Hardware Authors: Nowfel Mashnoor, Mohammad Akyash, Hadi Kamali, Kimia Azar | Published: 2025-04-09 2025.04.09 文献データベース