A Red Teaming Framework for Large Language Models: A Case Study on Faithfulness Evaluation

Authors: Abrar Alotaibi, Raed Mughus, Moataz Ahmed | Published: 2026-06-24

Representation Matters: An Empirical Study of Program Representations for LLM Vulnerability Reasoning

Authors: Andrew Stoltman, Johnathan Tang, Haipeng Cai | Published: 2026-06-24

Decoupling Reconnaissance and Exploitation: Measuring the Capability Boundaries of LLM-Based Web Penetration Testing

Authors: Liwei Yu, Shuo Li, Ming Zhou, Ge Chu, Yan Guo | Published: 2026-06-24

Privacy-Preserving RAG via Multi-Agent Semantic Rewriting: Achieving Confidentiality Without Compromising Contextual Fidelity

Authors: Yuanhe Zhao, Tianyu Zhang, Huafei Xing, Derek F. Wong, Jianbin Li, Tao Fang | Published: 2026-06-23

Themis: An explainable AI-enabled framework for Reinforcement Learning with Human Feedback

Authors: Andreas Chouliaras, Luke Connolly, Dimitris Chatzpoulos | Published: 2026-06-23

AdversaBench: Automated LLM Red-Teaming with Multi-Judge Confirmation and Cross-Model Transferability

Authors: Khanak Khandelwal | Published: 2026-06-23

LLMs Prompted for Legal Context Object More: Overrefusal from Small On-Premises LLMs in Criminal Legal Context

Authors: Anastasiia Kucherenko, François Brouchoud, Dimitri Percia David, Andrei Kucharavy | Published: 2026-06-23

Poster: Exploring the Limits of Audio-Based Detection of Turkish Phone Call Scams

Authors: Arda Eren, Micheal Cheung, Youqian Zhang, Grace Ngai, Eugene Yujun Fu | Published: 2026-06-23

Red-Teaming the Agentic Red-Team

Authors: Dario Pasquini, Michal Bazyli, Taras Fedynyshyn, Artem Sorokin | Published: 2026-06-23

PHANTOM: A Large-Scale Dataset of Multimodal Adversarial Attacks for Vision-Language Models

Authors: Simone Gallivanone, Hossein Khodadadi, Mauro Dore, Mauro Medda, Nicola Franco | Published: 2026-06-23