Attack Evaluation

SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models

Authors: Boyang Zhang, Zheng Li, Ziqing Yang, Xinlei He, Michael Backes, Mario Fritz, Yang Zhang | Published: 2023-10-19
Membership Inference
Model Extraction Attack
Attack Evaluation

Attack Prompt Generation for Red Teaming and Defending Large Language Models

Authors: Boyi Deng, Wenjie Wang, Fuli Feng, Yang Deng, Qifan Wang, Xiangnan He | Published: 2023-10-19
Prompt Injection
Attack Evaluation
Adversarial Example

Last One Standing: A Comparative Analysis of Security and Privacy of Soft Prompt Tuning, LoRA, and In-Context Learning

Authors: Rui Wen, Tianhao Wang, Michael Backes, Yang Zhang, Ahmed Salem | Published: 2023-10-17
Privacy Technique
Model Extraction Attack
Attack Evaluation

BufferSearch: Generating Black-Box Adversarial Texts With Lower Queries

Authors: Wenjie Lv, Zhen Wang, Yitao Zheng, Zhehua Zhong, Qi Xuan, Tianyi Chen | Published: 2023-10-14
Attack Evaluation
Adversarial Example
Optimization Methods

On the Feasibility of Cross-Language Detection of Malicious Packages in npm and PyPI

Authors: Piergiorgio Ladisa, Serena Elisa Ponta, Nicola Ronzoni, Matias Martinez, Olivier Barais | Published: 2023-10-14
Malicious Package Detection
Attack Evaluation
Feature Selection Method

Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation

Authors: Yangsibo Huang, Samyak Gupta, Mengzhou Xia, Kai Li, Danqi Chen | Published: 2023-10-10
Prompt Injection
Attack Evaluation
Adversarial Attack

Test-Time Poisoning Attacks Against Test-Time Adaptation Models

Authors: Tianshuo Cong, Xinlei He, Yun Shen, Yang Zhang | Published: 2023-08-16
Poisoning
Model Performance Evaluation
Attack Evaluation

Diff-CAPTCHA: An Image-based CAPTCHA with Security Enhanced by Denoising Diffusion Model

Authors: Ran Jiang, Sanfeng Zhang, Linfeng Liu, Yanbing Peng | Published: 2023-08-16
Security Assurance
Attack Evaluation
Watermark Robustness

Understanding Multi-Turn Toxic Behaviors in Open-Domain Chatbots

Authors: Bocheng Chen, Guangjing Wang, Hanqing Guo, Yuanda Wang, Qiben Yan | Published: 2023-07-14
Prompt Injection
Dialogue System
Attack Evaluation

Group-based Robustness: A General Framework for Customized Robustness in the Real World

Authors: Weiran Lin, Keane Lucas, Neo Eyal, Lujo Bauer, Michael K. Reiter, Mahmood Sharif | Published: 2023-06-29 | Updated: 2024-03-10
Group-Based Robustness
Attack Evaluation
Adversarial Attack Detection