Literature Database

Align in Depth: Defending Jailbreak Attacks via Progressive Answer Detoxification

Authors: Yingjie Zhang, Tong Liu, Zhe Zhao, Guozhu Meng, Kai Chen | Published: 2025-03-14
Disabling Safety Mechanisms of LLM
Prompt Injection
Malicious Prompt

Tempest: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search

Authors: Andy Zhou, Ron Arel | Published: 2025-03-13 | Updated: 2025-05-21
Disabling Safety Mechanisms of LLM
Attack Method
Generative Model

CASTLE: Benchmarking Dataset for Static Code Analyzers and LLMs towards CWE Detection

Authors: Richard A. Dubniczky, Krisztofer Zoltán Horvát, Tamás Bisztray, Mohamed Amine Ferrag, Lucas C. Cordeiro, Norbert Tihanyi | Published: 2025-03-12 | Updated: 2025-03-31
Security Metric
Prompt leaking
Vulnerability Mitigation Technique

Adv-CPG: A Customized Portrait Generation Framework with Facial Adversarial Attacks

Authors: Junying Wang, Hongyuan Zhang, Yuan Yuan | Published: 2025-03-11
Privacy Protection
Adversarial Example
Face Recognition System

Split-n-Chain: Privacy-Preserving Multi-Node Split Learning with Blockchain-Based Auditability

Authors: Mukesh Sahani, Binanda Sengupta | Published: 2025-03-10 | Updated: 2025-04-15
Performance Evaluation
Privacy Protection Method
Distributed Learning

Queueing, Predictions, and LLMs: Challenges and Open Problems

Authors: Michael Mitzenmacher, Rana Shahout | Published: 2025-03-10
LLM Performance Evaluation
Scheduling Method
Prediction-Based Scheduling

How Well Can Differential Privacy Be Audited in One Run?

Authors: Amit Keinan, Moshe Shenfeld, Katrina Ligett | Published: 2025-03-10 | Updated: 2025-05-26
Privacy Issues
監査手法
Watermark Design

Probabilistic Modeling of Jailbreak on Multimodal LLMs: From Quantification to Application

Authors: Wenzhuo Xu, Zhipeng Wei, Xiongtao Sun, Zonghao Ying, Deyue Zhang, Dongdong Yang, Xiangzheng Zhang, Quanchen Zou | Published: 2025-03-10 | Updated: 2025-07-31
Prompt Injection
Large Language Model
Robustness of Watermarking Techniques

Secure On-Device Video OOD Detection Without Backpropagation

Authors: Shawn Li, Peilin Cai, Yuxiao Zhou, Zhiyu Ni, Renjie Liang, You Qin, Yi Nian, Zhengzhong Tu, Xiyang Hu, Yue Zhao | Published: 2025-03-08 | Updated: 2025-03-17
Privacy Protection Method
Framework
Deep Learning

Nearly Optimal Differentially Private ReLU Regression

Authors: Meng Ding, Mingxi Lei, Shaowei Wang, Tianhang Zheng, Di Wang, Jinhui Xu | Published: 2025-03-08 | Updated: 2025-06-10
Privacy Protection Mechanism
Convergence Property
Differential Privacy