Understanding and Mitigating Over-refusal for Large Language Models via Safety Representation Authors: Junbo Zhang, Ran Chen, Qianli Zhou, Xinyang Deng, Wen Jiang | Published: 2025-11-24 | Literature Database
LLM-CSEC: Empirical Evaluation of Security in C/C++ Code Generated by Large Language Models Authors: Muhammad Usman Shahid, Chuadhry Mujeeb Ahmed, Rajiv Ranjan | Published: 2025-11-24 | Literature Database
Defending Large Language Models Against Jailbreak Exploits with Responsible AI Considerations Authors: Ryan Wong, Hosea David Yu Fei Ng, Dhananjai Sharma, Glenn Jun Jie Ng, Kavishvaran Srinivasan | Published: 2025-11-24 | Literature Database
RoguePrompt: Dual-Layer Ciphering for Self-Reconstruction to Circumvent LLM Moderation Authors: Benyamin Tafreshian | Published: 2025-11-24 | Literature Database
Evaluation of Real-Time Mitigation Techniques for Cyber Security in IEC 61850 / IEC 62351 Substations Authors: Akila Herath, Chen-Ching Liu, Junho Hong, Kuchan Park | Published: 2025-11-24 | Literature Database
Subtract the Corruption: Training-Data-Free Corrective Machine Unlearning using Task Arithmetic Authors: Mostafa Mozafari, Farooq Ahmad Wani, Maria Sofia Bucarelli, Fabrizio Silvestri | Published: 2025-11-24 | Literature Database
Q-MLLM: Vector Quantization for Robust Multimodal Large Language Model Security Authors: Wei Zhao, Zhe Li, Yige Li, Jun Sun | Published: 2025-11-20 | Literature Database
PSM: Prompt Sensitivity Minimization via LLM-Guided Black-Box Optimization Authors: Huseein Jawad, Nicolas Brunel | Published: 2025-11-20 | Literature Database
ART: A Graph-based Framework for Investigating Illicit Activity in Monero via Address-Ring-Transaction Structures Authors: Andrea Venturi, Imanol Jerico-Yoldi, Francesco Zola, Raul Orduna | Published: 2025-11-20 | Literature Database
Small Language Models for Phishing Website Detection: Cost, Performance, and Privacy Trade-Offs Authors: Georg Goldenits, Philip Koenig, Sebastian Raubitzek, Andreas Ekelhart | Published: 2025-11-19 | Literature Database