On the Geometry of Regularization in Adversarial Training: High-Dimensional Asymptotics and Generalization Bounds | Authors: Matteo Vilucchio, Nikolaos Tsilivis, Bruno Loureiro, Julia Kempe | Published: 2024-10-21
When Machine Unlearning Meets Retrieval-Augmented Generation (RAG): Keep Secret or Forget Knowledge? | Authors: Shang Wang, Tianqing Zhu, Dayong Ye, Wanlei Zhou | Published: 2024-10-20 | Updated: 2025-10-13
Jailbreaking and Mitigation of Vulnerabilities in Large Language Models | Authors: Benji Peng, Keyu Chen, Qian Niu, Ziqian Bi, Ming Liu, Pohsun Feng, Tianyang Wang, Lawrence K. Q. Yan, Yizhu Wen, Yichao Zhang, Caitlyn Heqi Yin | Published: 2024-10-20 | Updated: 2025-05-08
A Novel Reinforcement Learning Model for Post-Incident Malware Investigations | Authors: Dipo Dunsin, Mohamed Chahine Ghanem, Karim Ouazzane, Vassil Vassilev | Published: 2024-10-19 | Updated: 2025-01-12
Enhancing Prompt Injection Attacks to LLMs via Poisoning Alignment | Authors: Zedian Shao, Hongbin Liu, Jaden Mu, Neil Zhenqiang Gong | Published: 2024-10-18 | Updated: 2025-09-15
Feint and Attack: Attention-Based Strategies for Jailbreaking and Protecting LLMs | Authors: Rui Pu, Chaozhuo Li, Rui Ha, Zejian Chen, Litian Zhang, Zheng Liu, Lirong Qiu, Zaisheng Ye | Published: 2024-10-18 | Updated: 2025-07-08
Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation | Authors: Shuai Zhao, Xiaobao Wu, Cong-Duy Nguyen, Yanhao Jia, Meihuizi Jia, Yichao Feng, Luu Anh Tuan | Published: 2024-10-18 | Updated: 2025-05-20
Private Counterfactual Retrieval | Authors: Mohamed Nomeir, Pasan Dissanayake, Shreya Meel, Sanghamitra Dutta, Sennur Ulukus | Published: 2024-10-17 | Updated: 2025-07-24
FTSmartAudit: A Knowledge Distillation-Enhanced Framework for Automated Smart Contract Auditing Using Fine-Tuned LLMs | Authors: Zhiyuan Wei, Jing Sun, Zijian Zhang, Xianhao Zhang, Zhe Hou | Published: 2024-10-17 | Updated: 2025-11-03
Low-Rank Adversarial PGD Attack | Authors: Dayana Savostianova, Emanuele Zangrando, Francesco Tudisco | Published: 2024-10-16