Align in Depth: Defending Jailbreak Attacks via Progressive Answer Detoxification

Authors: Yingjie Zhang, Tong Liu, Zhe Zhao, Guozhu Meng, Kai Chen | Published: 2025-03-14

Siege: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search

Authors: Andy Zhou | Published: 2025-03-13 | Updated: 2025-03-16

CASTLE: Benchmarking Dataset for Static Code Analyzers and LLMs towards CWE Detection

Authors: Richard A. Dubniczky, Krisztofer Zoltán Horvát, Tamás Bisztray, Mohamed Amine Ferrag, Lucas C. Cordeiro, Norbert Tihanyi | Published: 2025-03-12 | Updated: 2025-03-31

Adv-CPG: A Customized Portrait Generation Framework with Facial Adversarial Attacks

Authors: Junying Wang, Hongyuan Zhang, Yuan Yuan | Published: 2025-03-11

Split-n-Chain: Privacy-Preserving Multi-Node Split Learning with Blockchain-Based Auditability

Authors: Mukesh Sahani, Binanda Sengupta | Published: 2025-03-10 | Updated: 2025-04-15

Queueing, Predictions, and LLMs: Challenges and Open Problems

Authors: Michael Mitzenmacher, Rana Shahout | Published: 2025-03-10

How Well Can Differential Privacy Be Audited in One Run?

Authors: Amit Keinan, Moshe Shenfeld, Katrina Ligett | Published: 2025-03-10 | Updated: 2025-05-26

Secure On-Device Video OOD Detection Without Backpropagation

Authors: Shawn Li, Peilin Cai, Yuxiao Zhou, Zhiyu Ni, Renjie Liang, You Qin, Yi Nian, Zhengzhong Tu, Xiyang Hu, Yue Zhao | Published: 2025-03-08 | Updated: 2025-03-17

Nearly Optimal Differentially Private ReLU Regression

Authors: Meng Ding, Mingxi Lei, Shaowei Wang, Tianhang Zheng, Di Wang, Jinhui Xu | Published: 2025-03-08 | Updated: 2025-06-10

ToxicSQL: Migrating SQL Injection Threats into Text-to-SQL Models via Backdoor Attack

Authors: Meiyu Lin, Haichuan Zhang, Jiale Lao, Renyuan Li, Yuanchun Zhou, Carl Yang, Yang Cao, Mingjie Tang | Published: 2025-03-07 | Updated: 2025-04-03