A Red Teaming Framework for Large Language Models: A Case Study on Faithfulness Evaluation | Authors: Abrar Alotaibi, Raed Mughus, Moataz Ahmed | Published: 2026-06-24 | 2026.06.24 | 2026.06.26 | Literature Database
Representation Matters: An Empirical Study of Program Representations for LLM Vulnerability Reasoning | Authors: Andrew Stoltman, Johnathan Tang, Haipeng Cai | Published: 2026-06-24 | 2026.06.24 | 2026.06.26 | Literature Database
Decoupling Reconnaissance and Exploitation: Measuring the Capability Boundaries of LLM-Based Web Penetration Testing | Authors: Liwei Yu, Shuo Li, Ming Zhou, Ge Chu, Yan Guo | Published: 2026-06-24 | 2026.06.24 | 2026.06.26 | Literature Database
Privacy-Preserving RAG via Multi-Agent Semantic Rewriting: Achieving Confidentiality Without Compromising Contextual Fidelity | Authors: Yuanhe Zhao, Tianyu Zhang, Huafei Xing, Derek F. Wong, Jianbin Li, Tao Fang | Published: 2026-06-23 | 2026.06.23 | 2026.06.25 | Literature Database
Themis: An explainable AI-enabled framework for Reinforcement Learning with Human Feedback | Authors: Andreas Chouliaras, Luke Connolly, Dimitris Chatzpoulos | Published: 2026-06-23 | 2026.06.23 | 2026.06.25 | Literature Database
AdversaBench: Automated LLM Red-Teaming with Multi-Judge Confirmation and Cross-Model Transferability | Authors: Khanak Khandelwal | Published: 2026-06-23 | 2026.06.23 | 2026.06.25 | Literature Database
LLMs Prompted for Legal Context Object More: Overrefusal from Small On-Premises LLMs in Criminal Legal Context | Authors: Anastasiia Kucherenko, François Brouchoud, Dimitri Percia David, Andrei Kucharavy | Published: 2026-06-23 | 2026.06.23 | 2026.06.25 | Literature Database
Poster: Exploring the Limits of Audio-Based Detection of Turkish Phone Call Scams | Authors: Arda Eren, Micheal Cheung, Youqian Zhang, Grace Ngai, Eugene Yujun Fu | Published: 2026-06-23 | 2026.06.23 | 2026.06.25 | Literature Database
Red-Teaming the Agentic Red-Team | Authors: Dario Pasquini, Michal Bazyli, Taras Fedynyshyn, Artem Sorokin | Published: 2026-06-23 | 2026.06.23 | 2026.06.25 | Literature Database
PHANTOM: A Large-Scale Dataset of Multimodal Adversarial Attacks for Vision-Language Models | Authors: Simone Gallivanone, Hossein Khodadadi, Mauro Dore, Mauro Medda, Nicola Franco | Published: 2026-06-23 | 2026.06.23 | 2026.06.25 | Literature Database