Be Your Own Red Teamer: Safety Alignment via Self-Play and Reflective Experience Replay | Authors: Hao Wang, Yanting Wang, Hao Li, Rui Li, Lei Sha | Published: 2026-01-15 | Prompt Injection, Adversarial Attack Analysis, Self-Learning Method | 2026.01.15 2026.01.17 | Literature Database
SecureCAI: Injection-Resilient LLM Assistants for Cybersecurity Operations | Authors: Mohammed Himayath Ali, Mohammed Aqib Abdullah, Mohammed Mudassir Uddin, Shahnawaz Alam | Published: 2026-01-12 | Indirect Prompt Injection, Prompt Injection, Adversarial Attack Analysis | 2026.01.12 2026.01.14 | Literature Database
Defenses Against Prompt Attacks Learn Surface Heuristics | Authors: Shawn Li, Chenxiao Yu, Zhiyu Ni, Hao Li, Charith Peris, Chaowei Xiao, Yue Zhao | Published: 2026-01-12 | Prompt leaking, Performance Evaluation, Adversarial Attack Analysis | 2026.01.12 2026.01.14 | Literature Database
When Reject Turns into Accept: Quantifying the Vulnerability of LLM-Based Scientific Reviewers to Indirect Prompt Injection | Authors: Devanshu Sahoo, Manish Prasad, Vasudev Majhi, Jahnvi Singh, Vinay Chamola, Yash Sinha, Murari Mandal, Dhruv Kumar | Published: 2025-12-11 | Indirect Prompt Injection, Adversarial Attack Analysis, Evaluation Method | 2025.12.11 2025.12.13 | Literature Database
DUMB and DUMBer: Is Adversarial Training Worth It in the Real World? | Authors: Francesco Marchiori, Marco Alecci, Luca Pajola, Mauro Conti | Published: 2025-06-23 | Model Architecture, Certified Robustness, Adversarial Attack Analysis | 2025.06.23 2025.06.25 | Literature Database
Exploring Backdoor Attack and Defense for LLM-empowered Recommendations | Authors: Liangbo Ning, Wenqi Fan, Qing Li | Published: 2025-04-15 | LLM Performance Evaluation, Poisoning attack on RAG, Adversarial Attack Analysis | 2025.04.15 2025.05.27 | Literature Database
Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails | Authors: William Hackett, Lewis Birch, Stefan Trawicki, Neeraj Suri, Peter Garraghan | Published: 2025-04-15 | Updated: 2025-04-16 | LLM Performance Evaluation, Prompt Injection, Adversarial Attack Analysis | 2025.04.15 2025.05.27 | Literature Database
Adversarial Attacks Against Medical Deep Learning Systems | Authors: Samuel G. Finlayson, Hyung Won Chung, Isaac S. Kohane, Andrew L. Beam | Published: 2018-04-15 | Updated: 2019-02-04 | Adversarial Learning, Adversarial Attack Analysis, Deep Learning | 2018.04.15 2025.05.28 | Literature Database
A Grid Based Adversarial Clustering Algorithm | Authors: Wutao Wei, Nikhil Gupta, Bowei Xi | Published: 2018-04-13 | Updated: 2024-11-21 | Data Contamination Detection, Adversarial Attack Analysis, Anomaly Detection Method | 2018.04.13 2025.05.28 | Literature Database
Label Sanitization against Label Flipping Poisoning Attacks | Authors: Andrea Paudice, Luis Muñoz-González, Emil C. Lupu | Published: 2018-03-02 | Updated: 2018-10-02 | Adversarial Attack Analysis, Machine Learning Technology, Detection of Poisonous Data | 2018.03.02 2025.05.28 | Literature Database