Feint and Attack: Attention-Based Strategies for Jailbreaking and Protecting LLMs | Authors: Rui Pu, Chaozhuo Li, Rui Ha, Zejian Chen, Litian Zhang, Zheng Liu, Lirong Qiu, Zaisheng Ye | Published: 2024-10-18 | Updated: 2025-07-08 | Tags: Disabling Safety Mechanisms of LLM, Prompt Injection, Prompt validation | 2024.10.18 2025.07.10 | Literature Database
Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation | Authors: Shuai Zhao, Xiaobao Wu, Cong-Duy Nguyen, Yanhao Jia, Meihuizi Jia, Yichao Feng, Luu Anh Tuan | Published: 2024-10-18 | Updated: 2025-05-20 | Tags: Backdoor Detection, Backdoor Attack Techniques, Knowledge Distillation | 2024.10.18 2025.05.28 | Literature Database
Private Counterfactual Retrieval | Authors: Mohamed Nomeir, Pasan Dissanayake, Shreya Meel, Sanghamitra Dutta, Sennur Ulukus | Published: 2024-10-17 | Updated: 2025-07-24 | Tags: Privacy Protection Method, Distance Evaluation Method, Watermark Evaluation | 2024.10.17 2025.07.26 | Literature Database
Low-Rank Adversarial PGD Attack | Authors: Dayana Savostianova, Emanuele Zangrando, Francesco Tudisco | Published: 2024-10-16 | Tags: Attack Method | 2024.10.16 2025.05.27 | Literature Database
Deep Learning Based XIoT Malware Analysis: A Comprehensive Survey, Taxonomy, and Research Challenges | Authors: Rami Darwish, Mahmoud Abdelsalam, Sajad Khorsandroo | Published: 2024-10-14 | Tags: XIoT Malware Analysis, Malware Classification | 2024.10.14 2025.05.27 | Literature Database
Denial-of-Service Poisoning Attacks against Large Language Models | Authors: Kuofeng Gao, Tianyu Pang, Chao Du, Yong Yang, Shu-Tao Xia, Min Lin | Published: 2024-10-14 | Tags: Prompt Injection, Model DoS, Resource Scarcity Issues | 2024.10.14 2025.05.27 | Literature Database
Adversarially Robust Out-of-Distribution Detection Using Lyapunov-Stabilized Embeddings | Authors: Hossein Mirzaei, Mackenzie W. Mathis | Published: 2024-10-14 | Updated: 2025-01-26 | Tags: Membership Inference, Adversarial Training | 2024.10.14 2025.05.27 | Literature Database
Towards Calibrated Losses for Adversarial Robust Reject Option Classification | Authors: Vrund Shah, Tejas Chaudhari, Naresh Manwani | Published: 2024-10-14 | Tags: Adversarial Training | 2024.10.14 2025.05.27 | Literature Database
Regularized Robustly Reliable Learners and Instance Targeted Attacks | Authors: Avrim Blum, Donya Saless | Published: 2024-10-14 | Updated: 2025-05-08 | Tags: Sample Complexity, Robustness Evaluation, Robust Optimization | 2024.10.14 2025.05.27 | Literature Database
Model-based Large Language Model Customization as Service | Authors: Zhaomin Wu, Jizhou Guo, Junyi Hou, Bingsheng He, Lixin Fan, Qiang Yang | Published: 2024-10-14 | Updated: 2025-05-22 | Tags: Text Generation Method, Privacy Management, Differential Privacy | 2024.10.14 2025.05.28 | Literature Database