Sandwich attack: Multi-language Mixture Adaptive Attack on LLMs | Authors: Bibek Upadhayay, Vahid Behzadan | Published: 2024-04-09 | Tags: LLM Security, Prompt Injection, Attack Method | 2024.04.09 2025.05.27 Literature Database
Aggressive or Imperceptible, or Both: Network Pruning Assisted Hybrid Byzantines in Federated Learning | Authors: Emre Ozfatura, Kerem Ozfatura, Alptekin Kupcu, Deniz Gunduz | Published: 2024-04-09 | Tags: Poisoning, Attack Method, Defense Method
BruSLeAttack: A Query-Efficient Score-Based Black-Box Sparse Adversarial Attack | Authors: Viet Quoc Vo, Ehsan Abbasnejad, Damith C. Ranasinghe | Published: 2024-04-08 | Updated: 2024-06-01 | Tags: Watermarking, Attack Method, Adversarial Example
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks | Authors: Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion | Published: 2024-04-02 | Updated: 2024-10-07 | Tags: LLM Security, Prompt Injection, Attack Method
Humanizing Machine-Generated Content: Evading AI-Text Detection through Adversarial Attack | Authors: Ying Zhou, Ben He, Le Sun | Published: 2024-04-02 | Tags: LLM Security, Watermarking, Attack Method
Adversarial Attacks and Defenses in Fault Detection and Diagnosis: A Comprehensive Benchmark on the Tennessee Eastman Process | Authors: Vitaliy Pozdnyakov, Aleksandr Kovalenko, Ilya Makarov, Mikhail Drobyshevskiy, Kirill Lukyanov | Published: 2024-03-20 | Updated: 2024-06-07 | Tags: Attack Method, Adversarial Example, Defense Method
Robustness bounds on the successful adversarial examples in probabilistic models: Implications from Gaussian processes | Authors: Hiroaki Maeshima, Akira Otsuka | Published: 2024-03-04 | Updated: 2025-03-19 | Tags: Attack Method, Adversarial Example, Watermark Evaluation
AutoAttacker: A Large Language Model Guided System to Implement Automatic Cyber-attacks | Authors: Jiacen Xu, Jack W. Stokes, Geoff McDonald, Xuesong Bai, David Marshall, Siyue Wang, Adith Swaminathan, Zhou Li | Published: 2024-03-02 | Tags: LLM Security, Prompt Injection, Attack Method
Attacking Delay-based PUFs with Minimal Adversary Model | Authors: Hongming Fei, Owen Millwood, Prosanta Gope, Jack Miskelly, Biplab Sikdar | Published: 2024-03-01 | Tags: Evaluation Methods for PUF, Model Performance Evaluation, Attack Method
Coercing LLMs to do and reveal (almost) anything | Authors: Jonas Geiping, Alex Stein, Manli Shu, Khalid Saifullah, Yuxin Wen, Tom Goldstein | Published: 2024-02-21 | Tags: LLM Security, Prompt Injection, Attack Method