Extending the Formalism and Theoretical Foundations of Cryptography to AI | Authors: Federico Villa, F. Betül Durak, Tadayoshi Kohno, Tapdig Maharramli, Franziska Roesner | Published: 2026-03-03 | Tags: Data Privacy Management, Safety Evaluation, Threat Model | 2026.03.03 2026.03.04 | Literature Database
Co-Evolutionary Multi-Modal Alignment via Structured Adversarial Evolution | Authors: Guoxin Shi, Haoyu Wang, Zaihui Yang, Yuxing Wang, Yongzhe Chang | Published: 2026-03-02 | Tags: Alignment, Safety Evaluation, Machine Learning Applications | 2026.03.02 2026.03.04 | Literature Database
From Secure Agentic AI to Secure Agentic Web: Challenges, Threats, and Future Directions | Authors: Zhihang Deng, Jiaping Gui, Weinan Zhang | Published: 2026-03-02 | Tags: Indirect Prompt Injection, Safety Evaluation, Threat Model | 2026.03.02 2026.03.04 | Literature Database
LLMs Can Unlearn Refusal with Only 1,000 Benign Samples | Authors: Yangyang Guo, Ziwei Xu, Si Liu, Zhiming Zheng, Mohan Kankanhalli | Published: 2026-01-27 | Tags: LLM Utilization, Large Language Model, Safety Evaluation | 2026.01.27 2026.01.29 | Literature Database
The Scales of Justitia: A Comprehensive Survey on Safety Evaluation of LLMs | Authors: Songyang Liu, Chaozhuo Li, Jiameng Qiu, Xi Zhang, Feiran Huang, Litian Zhang, Yiming Hei, Philip S. Yu | Published: 2025-06-06 | Updated: 2025-10-30 | Tags: Alignment, Large Language Model, Safety Evaluation | 2025.06.06 2025.11.01 | Literature Database
SafeCOMM: A Study on Safety Degradation in Fine-Tuned Telecom Large Language Models | Authors: Aladin Djuhera, Swanand Ravindra Kadhe, Farhan Ahmed, Syed Zawad, Fernando Koch, Walid Saad, Holger Boche | Published: 2025-05-29 | Updated: 2025-10-27 | Tags: Prompt Injection, Large Language Model, Safety Evaluation | 2025.05.29 2025.10.29 | Literature Database