Prompt Injection

Latent-space adversarial training with post-aware calibration for defending large language models against jailbreak attacks

Authors: Xin Yi, Yue Li, Linlin Wang, Xiaoling Wang, Liang He | Published: 2025-01-18
Prompt Injection
Adversarial Training
Excessive Denial Mitigation

Computing Optimization-Based Prompt Injections Against Closed-Weights Models By Misusing a Fine-Tuning API

Authors: Andrey Labunets, Nishit V. Pandya, Ashish Hooda, Xiaohan Fu, Earlence Fernandes | Published: 2025-01-16
Prompt Injection
Attack Evaluation
Optimization Problem

A Survey on Responsible LLMs: Inherent Risk, Malicious Use, and Mitigation Strategy

Authors: Huandong Wang, Wenjie Fu, Yingzhou Tang, Zhilong Chen, Yuxi Huang, Jinghua Piao, Chen Gao, Fengli Xu, Tao Jiang, Yong Li | Published: 2025-01-16
Survey Paper
Privacy Protection
Prompt Injection
Large Language Model

Gandalf the Red: Adaptive Security for LLMs

Authors: Niklas Pfister, Václav Volhejn, Manuel Knott, Santiago Arias, Julia Bazińska, Mykhailo Bichurin, Alan Commike, Janet Darling, Peter Dienes, Matthew Fiedler, David Haber, Matthias Kraft, Marco Lancini, Max Mathys, Damián Pascual-Ortiz, Jakub Podolak, Adrià Romero-López, Kyriacos Shiarlis, Andreas Signer, Zsolt Terek, Athanasios Theocharis, Daniel Timbrell, Samuel Trautwein, Samuel Watts, Yun-Han Wu, Mateo Rojas-Carulla | Published: 2025-01-14 | Updated: 2025-08-04
Prompt Injection
Model Extraction Attack
User Behavior Analysis

Unveiling Provider Bias in Large Language Models for Code Generation

Authors: Xiaoyu Zhang, Juan Zhai, Shiqing Ma, Qingshuang Bao, Weipeng Jiang, Chao Shen, Yang Liu | Published: 2025-01-14
Code Generation
Bias
Prompt Injection

Automating the Detection of Code Vulnerabilities by Analyzing GitHub Issues

Authors: Daniele Cipollone, Changjie Wang, Mariano Scazzariello, Simone Ferlin, Maliheh Izadi, Dejan Kostic, Marco Chiesa | Published: 2025-01-09
LLM Performance Evaluation
Prompt Injection
Vulnerability Management

SpaLLM-Guard: Pairing SMS Spam Detection Using Open-source and Commercial LLMs

Authors: Muhammad Salman, Muhammad Ikram, Nardine Basta, Mohamed Ali Kaafar | Published: 2025-01-09
LLM Performance Evaluation
Prompt Injection
Improvement of Learning

Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency

Authors: Shiji Zhao, Ranjie Duan, Fengxiang Wang, Chi Chen, Caixin Kang, Jialing Tao, YueFeng Chen, Hui Xue, Xingxing Wei | Published: 2025-01-09
Text Shuffle Inconsistency
Prompt Injection
Attack Method

Exploring Large Language Models for Semantic Analysis and Categorization of Android Malware

Authors: Brandon J Walton, Mst Eshita Khatun, James M Ghawaly, Aisha Ali-Gombe | Published: 2025-01-08
Prompt Injection
Prompt Engineering
Malware Classification

PromptGuard: Soft Prompt-Guided Unsafe Content Moderation for Text-to-Image Models

Authors: Lingzhi Yuan, Xinfeng Li, Chejian Xu, Guanhong Tao, Xiaojun Jia, Yihao Huang, Wei Dong, Yang Liu, XiaoFeng Wang, Bo Li | Published: 2025-01-07
Content Moderation
Soft Prompt Optimization
Prompt Injection