HoneyTrap: Deceiving Large Language Model Attackers to Honeypot Traps with Resilient Multi-Agent Defense | Authors: Siyuan Li, Xi Lin, Jun Wu, Zehao Liu, Haoyu Li, Tianjie Ju, Xiang Chen, Jianhua Li | Published: 2026-01-07 | Tags: Prompt Injection, Large Language Model, Adversarial Attack Detection | 2026.01.07 2026.01.09 | Literature Database
Jailbreaking LLMs & VLMs: Mechanisms, Evaluation, and Unified Defense | Authors: Zejian Chen, Chaozhuo Li, Chao Li, Xi Zhang, Litian Zhang, Yiming He | Published: 2026-01-07 | Tags: Prompt Injection, Large Language Model, Adversarial Attack Detection | 2026.01.07 2026.01.09
JPU: Bridging Jailbreak Defense and Unlearning via On-Policy Path Rectification | Authors: Xi Wang, Songlei Jian, Shasha Li, Xiaopeng Li, Zhaoye Li, Bin Ji, Baosheng Wang, Jie Yu | Published: 2026-01-06 | Tags: Prompt Injection, Model Extraction Attack, Adversarial Attack Detection | 2026.01.06 2026.01.08
Casting a SPELL: Sentence Pairing Exploration for LLM Limitation-breaking | Authors: Yifan Huang, Xiaojun Jia, Wenbo Guo, Yuqiang Sun, Yihao Huang, Chong Wang, Yang Liu | Published: 2025-12-24 | Tags: Data Selection Strategy, Prompt Injection, Adversarial Attack Detection | 2025.12.24 2025.12.26
Unsourced Adversarial CAPTCHA: A Bi-Phase Adversarial CAPTCHA Framework | Authors: Xia Du, Xiaoyuan Liu, Jizhe Zhou, Zheng Lin, Chi-man Pun, Zhe Chen, Wei Ni, Jun Luo | Published: 2025-06-12 | Tags: Certified Robustness, Adversarial Learning, Adversarial Attack Detection | 2025.06.12 2025.06.14
Let the Noise Speak: Harnessing Noise for a Unified Defense Against Adversarial and Backdoor Attacks | Authors: Md Hasan Shahriar, Ning Wang, Naren Ramakrishnan, Y. Thomas Hou, Wenjing Lou | Published: 2024-06-18 | Updated: 2025-04-14 | Tags: Certified Robustness, Reconstruction Attack, Adversarial Attack Detection | 2024.06.18 2025.05.27
Detecting Adversarial Spectrum Attacks via Distance to Decision Boundary Statistics | Authors: Wenwei Zhao, Xiaowen Li, Shangqing Zhao, Jie Xu, Yao Liu, Zhuo Lu | Published: 2024-02-14 | Tags: Adversarial Example, Adversarial Spectrum Attack Detection, Adversarial Attack Detection | 2024.02.14 2025.05.27
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast | Authors: Xiangming Gu, Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Ye Wang, Jing Jiang, Min Lin | Published: 2024-02-13 | Updated: 2024-06-03 | Tags: LLM Security, Prompt Injection, Adversarial Attack Detection | 2024.02.13 2025.05.27
System-level Analysis of Adversarial Attacks and Defenses on Intelligence in O-RAN based Cellular Networks | Authors: Azuka Chiejina, Brian Kim, Kaushik Chowdhury, Vijay K. Shah | Published: 2024-02-10 | Updated: 2024-02-13 | Tags: O-RAN Security, Cyber Attack, Adversarial Attack Detection | 2024.02.10 2025.05.27
Comparing Spectral Bias and Robustness For Two-Layer Neural Networks: SGD vs Adaptive Random Fourier Features | Authors: Aku Kammonen, Lisi Liang, Anamika Pandey, Raúl Tempone | Published: 2024-02-01 | Tags: Watermarking, Bias, Adversarial Attack Detection | 2024.02.01 2025.05.27