EEG-Defender: Defending against Jailbreak through Early Exit Generation of Large Language Models Authors: Chongwen Zhao, Zhihao Dou, Kaizhu Huang | Published: 2024-08-21 2024.08.21 2025.04.03 Literature Database
Hide Your Malicious Goal Into Benign Narratives: Jailbreak Large Language Models through Carrier Articles Authors: Zhilong Wang, Haizhou Wang, Nanqing Luo, Lan Zhang, Xiaoyan Sun, Yebo Cao, Peng Liu | Published: 2024-08-20 | Updated: 2025-02-07
Security Attacks on LLM-based Code Completion Tools Authors: Wen Cheng, Ke Sun, Xinyu Zhang, Wei Wang | Published: 2024-08-20 | Updated: 2025-01-02
Robust Image Classification: Defensive Strategies against FGSM and PGD Adversarial Attacks Authors: Hetvi Waghela, Jaydip Sen, Sneha Rakshit | Published: 2024-08-20
LeCov: Multi-level Testing Criteria for Large Language Models Authors: Xuan Xie, Jiayang Song, Yuheng Huang, Da Song, Fuyuan Zhang, Felix Juefei-Xu, Lei Ma | Published: 2024-08-20
Tracing Privacy Leakage of Language Models to Training Data via Adjusted Influence Functions Authors: Jinxin Liu, Zao Yang | Published: 2024-08-20 | Updated: 2024-09-05
Privacy Technologies for Financial Intelligence Authors: Yang Li, Thilina Ranbaduge, Kee Siong Ng | Published: 2024-08-19
Transferring Backdoors between Large Language Models by Knowledge Distillation Authors: Pengzhou Cheng, Zongru Wu, Tianjie Ju, Wei Du, Zhuosheng Zhang, Gongshen Liu | Published: 2024-08-19
Regularization for Adversarial Robust Learning Authors: Jie Wang, Rui Gao, Yao Xie | Published: 2024-08-19 | Updated: 2024-08-22
Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning Authors: Tiansheng Huang, Gautam Bhattacharya, Pratik Joshi, Josh Kimball, Ling Liu | Published: 2024-08-18 | Updated: 2024-09-03