Adversarial attack

Support is All You Need for Certified VAE Training

Authors: Changming Xu, Debangshu Banerjee, Deepak Vasisht, Gagandeep Singh | Published: 2025-04-16
Improvement of Learning
Adversarial attack
Watermark Design

Language Models May Verbatim Complete Text They Were Not Explicitly Trained On

Authors: Ken Ziyu Liu, Christopher A. Choquette-Choo, Matthew Jagielski, Peter Kairouz, Sanmi Koyejo, Percy Liang, Nicolas Papernot | Published: 2025-03-21 | Updated: 2025-03-25
RAG
Membership Disclosure Risk
Adversarial attack

Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment

Authors: Soumya Suvra Ghosal, Souradip Chakraborty, Vaibhav Singh, Tianrui Guan, Mengdi Wang, Ahmad Beirami, Furong Huang, Alvaro Velasquez, Dinesh Manocha, Amrit Singh Bedi | Published: 2024-11-27 | Updated: 2025-03-20
Prompt Injection
Safety Alignment
Adversarial attack

Infighting in the Dark: Multi-Label Backdoor Attack in Federated Learning

Authors: Ye Li, Yanchao Zhao, Chengcheng Zhu, Jiale Zhang | Published: 2024-09-29 | Updated: 2025-03-22
ID Mapping Construction
Backdoor Detection
Adversarial attack

Data Reconstruction Attacks and Defenses: A Systematic Evaluation

Authors: Sheng Liu, Zihan Wang, Yuxiao Chen, Qi Lei | Published: 2024-02-13 | Updated: 2025-03-22
Privacy Analysis
Model Robustness
Adversarial attack

Explainable and Transferable Adversarial Attack for ML-Based Network Intrusion Detectors

Authors: Hangsheng Zhang, Dongqi Han, Yinlong Liu, Zhiliang Wang, Jiyan Sun, Shangyuan Zhuang, Jiqiang Liu, Jinsong Dong | Published: 2024-01-19
Poisoning
Model Interpretability
Adversarial attack

PuriDefense: Randomized Local Implicit Adversarial Purification for Defending Black-box Query-based Attacks

Authors: Ping Guo, Zhiyuan Yang, Xi Lin, Qingchuan Zhao, Qingfu Zhang | Published: 2024-01-19
Watermarking
Adversarial attack
Defense Method

A provable initialization and robust clustering method for general mixture models

Authors: Soham Jana, Jianqing Fan, Sanjeev Kulkarni | Published: 2024-01-10 | Updated: 2024-10-23
Clustering Methods
Robustness Evaluation
Adversarial attack

Evasive Hardware Trojan through Adversarial Power Trace

Authors: Behnam Omidi, Khaled N. Khasawneh, Ihsen Alouani | Published: 2024-01-04
Watermarking
Adversarial attack
Watermark Robustness

Attack Tree Analysis for Adversarial Evasion Attacks

Authors: Yuki Yamaguchi, Toshiaki Aoki | Published: 2023-12-28
Poisoning
Adversarial attack
Watermark Evaluation