Literature Database

AttestLLM: Efficient Attestation Framework for Billion-scale On-device LLMs | Authors: Ruisi Zhang, Yifei Zhao, Neusha Javidnia, Mengxin Zheng, Farinaz Koushanfar | Published: 2025-09-08 | Tags: Security Strategy Generation, Efficiency Evaluation, Large Language Model
VulnRepairEval: An Exploit-Based Evaluation Framework for Assessing Large Language Model Vulnerability Repair Capabilities | Authors: Weizhe Wang, Wei Ma, Qiang Hu, Yao Zhang, Jianfei Sun, Bin Wu, Yang Liu, Guangquan Xu, Lingxiao Jiang | Published: 2025-09-03 | Tags: Prompt Injection, Large Language Model, Vulnerability Analysis
Safety Alignment Should Be Made More Than Just A Few Attention Heads | Authors: Chao Huang, Zefeng Zhang, Juewei Yue, Quangang Li, Chuang Zhang, Tingwen Liu | Published: 2025-08-27 | Tags: Prompt Injection, Large Language Model, Attention Mechanism
Confusion is the Final Barrier: Rethinking Jailbreak Evaluation and Investigating the Real Misuse Threat of LLMs | Authors: Yu Yan, Sheng Sun, Zhe Wang, Yijun Lin, Zenghao Duan, Zhifei Zheng, Min Liu, Zhiyi Yin, Jianping Zhang | Published: 2025-08-22 | Updated: 2025-09-15 | Tags: Privacy Assessment, Ethical Standards Compliance, Large Language Model
MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols | Authors: Yixuan Yang, Daoyuan Wu, Yufan Chen | Published: 2025-08-17 | Updated: 2025-10-09 | Tags: Prompt Leaking, Large Language Model, Defense Mechanism
Jailbreaking Commercial Black-Box LLMs with Explicitly Harmful Prompts | Authors: Chiyu Zhang, Lu Zhou, Xiaogang Xu, Jiafei Wu, Liming Fang, Zhe Liu | Published: 2025-08-14 | Tags: Social Engineering Attack, Prompt Injection, Large Language Model
EditMF: Drawing an Invisible Fingerprint for Your Large Language Models | Authors: Jiaxuan Wu, Yinghan Zhou, Wanli Peng, Yiming Xue, Juan Wen, Ping Zhong | Published: 2025-08-12 | Tags: Large Language Model, Author Attribution Method, Watermark Design
Repairing Vulnerabilities Without Invisible Hands: A Differentiated Replication Study on LLMs | Authors: Maria Camporese, Fabio Massacci | Published: 2025-07-28 | Tags: Prompt Injection, Large Language Model, Vulnerability Management
ARMOR: Aligning Secure and Safe Large Language Models via Meticulous Reasoning | Authors: Zhengyue Zhao, Yingzi Ma, Somesh Jha, Marco Pavone, Patrick McDaniel, Chaowei Xiao | Published: 2025-07-14 | Updated: 2025-10-20 | Tags: Large Language Model, Safety Analysis, Evaluation Criteria
GuardVal: Dynamic Large Language Model Jailbreak Evaluation for Comprehensive Safety Testing | Authors: Peiyan Zhang, Haibo Jin, Liying Kang, Haohan Wang | Published: 2025-07-10 | Tags: Prompt Validation, Large Language Model, Performance Evaluation Metrics