A Study of Backdoors in Instruction Fine-tuned Language Models

Authors: Jayaram Raghuram, George Kesidis, David J. Miller | Published: 2024-06-12 | Updated: 2024-08-21

Knowledge Return Oriented Prompting (KROP)

Authors: Jason Martin, Kenneth Yeung | Published: 2024-06-11

LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing

Authors: Hongxiang Zhang, Yuyang Rong, Yifeng He, Hao Chen | Published: 2024-06-11 | Updated: 2024-06-13

Adversarial Machine Unlearning

Authors: Zonglin Di, Sixie Yu, Yevgeniy Vorobeychik, Yang Liu | Published: 2024-06-11

Beyond Words: On Large Language Models Actionability in Mission-Critical Risk Analysis

Authors: Matteo Esposito, Francesco Palagiano, Valentina Lenarduzzi, Davide Taibi | Published: 2024-06-11 | Updated: 2024-09-06

Erasing Radio Frequency Fingerprints via Active Adversarial Perturbation

Authors: Zhaoyi Lu, Wenchao Xu, Ming Tu, Xin Xie, Cunqing Hua, Nan Cheng | Published: 2024-06-11 | Updated: 2024-06-12

VulDetectBench: Evaluating the Deep Capability of Vulnerability Detection with Large Language Models

Authors: Yu Liu, Lang Gao, Mingxin Yang, Yu Xie, Ping Chen, Xiaojin Zhang, Wei Chen | Published: 2024-06-11 | Updated: 2024-08-21

MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models

Authors: Tianle Gu, Zeyang Zhou, Kexin Huang, Dandan Liang, Yixu Wang, Haiquan Zhao, Yuanqi Yao, Xingge Qiao, Keqing Wang, Yujiu Yang, Yan Teng, Yu Qiao, Yingchun Wang | Published: 2024-06-11 | Updated: 2024-06-13

Ollabench: Evaluating LLMs’ Reasoning for Human-centric Interdependent Cybersecurity

Authors: Tam n. Nguyen | Published: 2024-06-11

A Survey of Recent Backdoor Attacks and Defenses in Large Language Models

Authors: Shuai Zhao, Meihuizi Jia, Zhongliang Guo, Leilei Gan, Xiaoyu Xu, Xiaobao Wu, Jie Fu, Yichao Feng, Fengjun Pan, Luu Anh Tuan | Published: 2024-06-10 | Updated: 2025-01-04