Integrating uncertainty quantification into randomized smoothing based robustness guarantees | Authors: Sina Däubener, Kira Maag, David Krueger, Asja Fischer | Published: 2024-10-27 | Adversarial Example, Equivalence Evaluation | 2024.10.27 2025.05.27 | Literature Database
On the Geometry of Regularization in Adversarial Training: High-Dimensional Asymptotics and Generalization Bounds | Authors: Matteo Vilucchio, Nikolaos Tsilivis, Bruno Loureiro, Julia Kempe | Published: 2024-10-21 | Convergence Analysis, Adversarial Training | 2024.10.21 2025.05.27 | Literature Database
Jailbreaking and Mitigation of Vulnerabilities in Large Language Models | Authors: Benji Peng, Keyu Chen, Qian Niu, Ziqian Bi, Ming Liu, Pohsun Feng, Tianyang Wang, Lawrence K. Q. Yan, Yizhu Wen, Yichao Zhang, Caitlyn Heqi Yin | Published: 2024-10-20 | Updated: 2025-05-08 | LLM Security, Disabling Safety Mechanisms of LLM, Prompt Injection | 2024.10.20 2025.05.27 | Literature Database
A Novel Reinforcement Learning Model for Post-Incident Malware Investigations | Authors: Dipo Dunsin, Mohamed Chahine Ghanem, Karim Ouazzane, Vassil Vassilev | Published: 2024-10-19 | Updated: 2025-01-12 | Cybersecurity, Malware Classification | 2024.10.19 2025.05.27 | Literature Database
Feint and Attack: Attention-Based Strategies for Jailbreaking and Protecting LLMs | Authors: Rui Pu, Chaozhuo Li, Rui Ha, Zejian Chen, Litian Zhang, Zheng Liu, Lirong Qiu, Zaisheng Ye | Published: 2024-10-18 | Updated: 2025-07-08 | Disabling Safety Mechanisms of LLM, Prompt Injection, Prompt validation | 2024.10.18 2025.07.10 | Literature Database
Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation | Authors: Shuai Zhao, Xiaobao Wu, Cong-Duy Nguyen, Yanhao Jia, Meihuizi Jia, Yichao Feng, Luu Anh Tuan | Published: 2024-10-18 | Updated: 2025-05-20 | Backdoor Detection, Backdoor Attack Techniques, Knowledge Distillation | 2024.10.18 2025.05.28 | Literature Database
Private Counterfactual Retrieval | Authors: Mohamed Nomeir, Pasan Dissanayake, Shreya Meel, Sanghamitra Dutta, Sennur Ulukus | Published: 2024-10-17 | Updated: 2025-07-24 | Privacy Protection Method, Distance Evaluation Method, Watermark Evaluation | 2024.10.17 2025.07.26 | Literature Database
Low-Rank Adversarial PGD Attack | Authors: Dayana Savostianova, Emanuele Zangrando, Francesco Tudisco | Published: 2024-10-16 | Attack Method | 2024.10.16 2025.05.27 | Literature Database
Deep Learning Based XIoT Malware Analysis: A Comprehensive Survey, Taxonomy, and Research Challenges | Authors: Rami Darwish, Mahmoud Abdelsalam, Sajad Khorsandroo | Published: 2024-10-14 | XIoT Malware Analysis, Malware Classification | 2024.10.14 2025.05.27 | Literature Database
Denial-of-Service Poisoning Attacks against Large Language Models | Authors: Kuofeng Gao, Tianyu Pang, Chao Du, Yong Yang, Shu-Tao Xia, Min Lin | Published: 2024-10-14 | Prompt Injection, Model DoS, Resource Scarcity Issues | 2024.10.14 2025.05.27 | Literature Database