Understanding and Mitigating Over-refusal for Large Language Models via Safety Representation

Authors: Junbo Zhang, Ran Chen, Qianli Zhou, Xinyang Deng, Wen Jiang | Published: 2025-11-24

LLM-CSEC: Empirical Evaluation of Security in C/C++ Code Generated by Large Language Models

Authors: Muhammad Usman Shahid, Chuadhry Mujeeb Ahmed, Rajiv Ranjan | Published: 2025-11-24

Defending Large Language Models Against Jailbreak Exploits with Responsible AI Considerations

Authors: Ryan Wong, Hosea David Yu Fei Ng, Dhananjai Sharma, Glenn Jun Jie Ng, Kavishvaran Srinivasan | Published: 2025-11-24

RoguePrompt: Dual-Layer Ciphering for Self-Reconstruction to Circumvent LLM Moderation

Authors: Benyamin Tafreshian | Published: 2025-11-24

Evaluation of Real-Time Mitigation Techniques for Cyber Security in IEC 61850 / IEC 62351 Substations

Authors: Akila Herath, Chen-Ching Liu, Junho Hong, Kuchan Park | Published: 2025-11-24

Subtract the Corruption: Training-Data-Free Corrective Machine Unlearning using Task Arithmetic

Authors: Mostafa Mozafari, Farooq Ahmad Wani, Maria Sofia Bucarelli, Fabrizio Silvestri | Published: 2025-11-24

Q-MLLM: Vector Quantization for Robust Multimodal Large Language Model Security

Authors: Wei Zhao, Zhe Li, Yige Li, Jun Sun | Published: 2025-11-20

PSM: Prompt Sensitivity Minimization via LLM-Guided Black-Box Optimization

Authors: Huseein Jawad, Nicolas Brunel | Published: 2025-11-20

ART: A Graph-based Framework for Investigating Illicit Activity in Monero via Address-Ring-Transaction Structures

Authors: Andrea Venturi, Imanol Jerico-Yoldi, Francesco Zola, Raul Orduna | Published: 2025-11-20

Small Language Models for Phishing Website Detection: Cost, Performance, and Privacy Trade-Offs

Authors: Georg Goldenits, Philip Koenig, Sebastian Raubitzek, Andreas Ekelhart | Published: 2025-11-19