Multi-Designated Detector Watermarking for Language Models | Authors: Zhengan Huang, Gongxian Zeng, Xin Mu, Yu Wang, Yue Yu | Published: 2024-09-26 | Updated: 2024-10-01 | Tags: LLM Security, Watermarking, Watermark Evaluation | 2024.09.26 2025.05.12 Literature Database
The poison of dimensionality | Authors: Lê-Nguyên Hoang | Published: 2024-09-25 | Tags: Poisoning, Model Performance Evaluation, Loss Function
Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method | Authors: Weichao Zhang, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng | Published: 2024-09-23 | Updated: 2025-04-01 | Tags: Disabling Safety Mechanisms of LLM, Model Performance Evaluation, Information Extraction
Order of Magnitude Speedups for LLM Membership Inference | Authors: Rongting Zhang, Martin Bertran, Aaron Roth | Published: 2024-09-22 | Updated: 2024-09-24 | Tags: LLM Security, Membership Inference, Low-Cost Membership Inference Method
PathSeeker: Exploring LLM Security Vulnerabilities with a Reinforcement Learning-Based Jailbreak Approach | Authors: Zhihao Lin, Wei Ma, Mingyi Zhou, Yanjie Zhao, Haoyu Wang, Yang Liu, Jun Wang, Li Li | Published: 2024-09-21 | Updated: 2024-10-03 | Tags: LLM Performance Evaluation, Prompt Injection
Obliviate: Neutralizing Task-agnostic Backdoors within the Parameter-efficient Fine-tuning Paradigm | Authors: Jaehan Kim, Minkyoo Song, Seung Ho Na, Seungwon Shin | Published: 2024-09-21 | Updated: 2024-10-06 | Tags: Backdoor Attack, Model Performance Evaluation, Defense Method
MalMixer: Few-Shot Malware Classification with Retrieval-Augmented Semi-Supervised Learning | Authors: Jiliang Li, Yifan Zhang, Yu Huang, Kevin Leach | Published: 2024-09-20 | Updated: 2025-04-17 | Tags: Data Augmentation Method, Poisoning, Malware Detection with Limited Samples
Extracting Memorized Training Data via Decomposition | Authors: Ellen Su, Anu Vellore, Amy Chang, Raffaele Mura, Blaine Nelson, Paul Kassianik, Amin Karbasi | Published: 2024-09-18 | Updated: 2024-10-01 | Tags: Training Data Extraction Method, Prompting Strategy, Model Performance Evaluation
Artemis: Efficient Commit-and-Prove SNARKs for zkML | Authors: Hidde Lycklama, Alexander Viand, Nikolay Avramov, Nicolas Küchler, Anwar Hithnawi | Published: 2024-09-18 | Tags: Framework, Model Performance Evaluation, Cryptography
Hard-Label Cryptanalytic Extraction of Neural Network Models | Authors: Yi Chen, Xiaoyang Dong, Jian Guo, Yantian Shen, Anyu Wang, Xiaoyun Wang | Published: 2024-09-18 | Tags: Model Extraction Attack, Attack Method, Computational Complexity