The Erasure Illusion: Stress-Testing the Generalization of LLM Forgetting Evaluation | Authors: Hengrui Jia, Taoran Li, Jonas Guan, Varun Chandrasekaran | Published: 2025-12-22 | Tags: LLM Utilization, Challenges of Generative Models, Transparency and Verification | 2025.12.22 2025.12.24 | Literature Database
DREAM: Dynamic Red-teaming across Environments for AI Models | Authors: Liming Lu, Xiang Gu, Junyu Huang, Jiawei Du, Yunhuai Liu, Yongbin Zhou, Shuchao Pang | Published: 2025-12-22 | Tags: Model Robustness, Dynamic Attack Evaluation Method, Vulnerability Attack Method | 2025.12.22 2025.12.24 | Literature Database
Efficient Jailbreak Mitigation Using Semantic Linear Classification in a Multi-Staged Pipeline | Authors: Akshaj Prashanth Rao, Advait Singh, Saumya Kumaar Saksena, Dhruv Kumar | Published: 2025-12-22 | Tags: Prompt Injection, Watermark, Defense Mechanism | 2025.12.22 2025.12.24 | Literature Database
Phishing Detection System: An Ensemble Approach Using Character-Level CNN and Feature Engineering | Authors: Rudra Dubey, Arpit Mani Tripathi, Archit Srivastava, Sarvpal Singh | Published: 2025-12-18 | Tags: Ensemble Learning, Next-Generation Phishing Detection, Feature Extraction | 2025.12.18 2025.12.20 | Literature Database
Prefix Probing: Lightweight Harmful Content Detection for Large Language Models | Authors: Jirui Yang, Hengqi Guo, Zhihui Lu, Yi Zhao, Yuansen Zhang, Shijing Hu, Qiang Duan, Yinggui Wang, Tao Wei | Published: 2025-12-18 | Tags: Token Distribution Analysis, Prompt Injection, Prompt Leaking | 2025.12.18 2025.12.20 | Literature Database
A Systematic Study of Code Obfuscation Against LLM-based Vulnerability Detection | Authors: Xiao Li, Yue Li, Hao Wu, Yue Zhang, Yechao Zhang, Fengyuan Xu, Sheng Zhong | Published: 2025-12-18 | Tags: Indirect Prompt Injection, Prompt Injection, Obfuscation Method | 2025.12.18 2025.12.20 | Literature Database
From Essence to Defense: Adaptive Semantic-aware Watermarking for Embedding-as-a-Service Copyright Protection | Authors: Hao Li, Yubing Ren, Yanan Cao, Yingjie Li, Fang Fang, Xuebin Wang | Published: 2025-12-18 | Tags: Copyright Protection, Watermark, Watermark Robustness | 2025.12.18 2025.12.20 | Literature Database
Large Language Models as a (Bad) Security Norm in the Context of Regulation and Compliance | Authors: Kaspar Rosager Ludvigsen | Published: 2025-12-18 | Tags: LLM Utilization, Indirect Prompt Injection, Large Language Model | 2025.12.18 2025.12.20 | Literature Database
Agent Tools Orchestration Leaks More: Dataset, Benchmark, and Mitigation | Authors: Yuxuan Qiao, Dongqin Liu, Hongchang Yang, Wei Zhou, Songlin Hu | Published: 2025-12-18 | Tags: Data Leakage, Privacy-Preserving Machine Learning, Watermark | 2025.12.18 2025.12.20 | Literature Database
In-Context Probing for Membership Inference in Fine-Tuned Language Models | Authors: Zhexi Lu, Hongliang Chi, Nathalie Baracaldo, Swanand Ravindra Kadhe, Yuseok Jeon, Lei Yu | Published: 2025-12-18 | Tags: Bias Detection in AI Output, Privacy-Preserving Machine Learning, Prompt Leaking | 2025.12.18 2025.12.20 | Literature Database