CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models

Authors: Yuetai Li, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Dinuka Sahabandu, Bhaskar Ramasubramanian, Radha Poovendran | Published: 2024-06-18 | Updated: 2025-03-27

FullCert: Deterministic End-to-End Certification for Training and Inference of Neural Networks

Authors: Tobias Lorenz, Marta Kwiatkowska, Mario Fritz | Published: 2024-06-17 | Updated: 2024-09-11

ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates

Authors: Fengqing Jiang, Zhangchen Xu, Luyao Niu, Bill Yuchen Lin, Radha Poovendran | Published: 2024-06-17 | Updated: 2025-01-07

GoldCoin: Grounding Large Language Models in Privacy Laws via Contextual Integrity Theory

Authors: Wei Fan, Haoran Li, Zheye Deng, Weiqi Wang, Yangqiu Song | Published: 2024-06-17 | Updated: 2024-10-04

Threat Modelling and Risk Analysis for Large Language Model (LLM)-Powered Applications

Authors: Stephen Burabari Tete | Published: 2024-06-16

Promoting Data and Model Privacy in Federated Learning through Quantized LoRA

Authors: JianHao Zhu, Changze Lv, Xiaohua Wang, Muling Wu, Wenhao Liu, Tianlong Li, Zixuan Ling, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang | Published: 2024-06-16

Really Unlearned? Verifying Machine Unlearning via Influential Sample Pairs

Authors: Heng Xu, Tianqing Zhu, Lefeng Zhang, Wanlei Zhou | Published: 2024-06-16

Trading Devil: Robust backdoor attack via Stochastic investment models and Bayesian approach

Authors: Orson Mengara | Published: 2024-06-15 | Updated: 2024-09-16

Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models

Authors: Rui Ye, Jingyi Chai, Xiangrui Liu, Yaodong Yang, Yanfeng Wang, Siheng Chen | Published: 2024-06-15

RMF: A Risk Measurement Framework for Machine Learning Models

Authors: Jan Schröder, Jakub Breier | Published: 2024-06-15