Adversarial Attack Detection

HoneyTrap: Deceiving Large Language Model Attackers to Honeypot Traps with Resilient Multi-Agent Defense

Authors: Siyuan Li, Xi Lin, Jun Wu, Zehao Liu, Haoyu Li, Tianjie Ju, Xiang Chen, Jianhua Li | Published: 2026-01-07
Prompt Injection
Large Language Model
Adversarial Attack Detection

Jailbreaking LLMs & VLMs: Mechanisms, Evaluation, and Unified Defense

Authors: Zejian Chen, Chaozhuo Li, Chao Li, Xi Zhang, Litian Zhang, Yiming He | Published: 2026-01-07
Prompt Injection
Large Language Model
Adversarial Attack Detection

JPU: Bridging Jailbreak Defense and Unlearning via On-Policy Path Rectification

Authors: Xi Wang, Songlei Jian, Shasha Li, Xiaopeng Li, Zhaoye Li, Bin Ji, Baosheng Wang, Jie Yu | Published: 2026-01-06
Prompt Injection
Model Extraction Attack
Adversarial Attack Detection

Casting a SPELL: Sentence Pairing Exploration for LLM Limitation-breaking

Authors: Yifan Huang, Xiaojun Jia, Wenbo Guo, Yuqiang Sun, Yihao Huang, Chong Wang, Yang Liu | Published: 2025-12-24
Data Selection Strategy
Prompt Injection
Adversarial Attack Detection

Unsourced Adversarial CAPTCHA: A Bi-Phase Adversarial CAPTCHA Framework

Authors: Xia Du, Xiaoyuan Liu, Jizhe Zhou, Zheng Lin, Chi-man Pun, Zhe Chen, Wei Ni, Jun Luo | Published: 2025-06-12
Certified Robustness
Adversarial Learning
Adversarial Attack Detection

Let the Noise Speak: Harnessing Noise for a Unified Defense Against Adversarial and Backdoor Attacks

Authors: Md Hasan Shahriar, Ning Wang, Naren Ramakrishnan, Y. Thomas Hou, Wenjing Lou | Published: 2024-06-18 | Updated: 2025-04-14
Certified Robustness
Reconstruction Attack
Adversarial Attack Detection

Detecting Adversarial Spectrum Attacks via Distance to Decision Boundary Statistics

Authors: Wenwei Zhao, Xiaowen Li, Shangqing Zhao, Jie Xu, Yao Liu, Zhuo Lu | Published: 2024-02-14
Adversarial Example
Adversarial Spectrum Attack Detection
Adversarial Attack Detection

Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast

Authors: Xiangming Gu, Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Ye Wang, Jing Jiang, Min Lin | Published: 2024-02-13 | Updated: 2024-06-03
LLM Security
Prompt Injection
Adversarial Attack Detection

System-level Analysis of Adversarial Attacks and Defenses on Intelligence in O-RAN based Cellular Networks

Authors: Azuka Chiejina, Brian Kim, Kaushik Chowhdury, Vijay K. Shah | Published: 2024-02-10 | Updated: 2024-02-13
O-RAN Security
Cyber Attack
Adversarial Attack Detection

Comparing Spectral Bias and Robustness For Two-Layer Neural Networks: SGD vs Adaptive Random Fourier Features

Authors: Aku Kammonen, Lisi Liang, Anamika Pandey, Raúl Tempone | Published: 2024-02-01
Watermarking
Bias
Adversarial Attack Detection