プロンプトインジェクション

Automated Static Vulnerability Detection via a Holistic Neuro-symbolic Approach

Authors: Penghui Li, Songchen Yao, Josef Sarfati Korich, Changhua Luo, Jianjia Yu, Yinzhi Cao, Junfeng Yang | Published: 2025-04-22
クエリ生成手法
プロンプトインジェクション
脆弱性検出

Exploring the Role of Large Language Models in Cybersecurity: A Systematic Survey

Authors: Shuang Tian, Tao Zhang, Jiqiang Liu, Jiacheng Wang, Xuangou Wu, Xiaoqiang Zhu, Ruichen Zhang, Weiting Zhang, Zhenhui Yuan, Shiwen Mao, Dong In Kim | Published: 2025-04-22
インダイレクトプロンプトインジェクション
プロンプトインジェクション
大規模言語モデル

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

Authors: Kun Wang, Guibin Zhang, Zhenhong Zhou, Jiahao Wu, Miao Yu, Shiqian Zhao, Chenlong Yin, Jinhu Fu, Yibo Yan, Hanjun Luo, Liang Lin, Zhihao Xu, Haolang Lu, Xinye Cao, Xinyun Zhou, Weifei Jin, Fanci Meng, Junyuan Mao, Hao Wu, Minghe Wang, Fan Zhang, Junfeng Fang, Chengwei Liu, Yifan Zhang, Qiankun Li, Chongye Guo, Yalan Qin, Yi Ding, Donghai Hong, Jiaming Ji, Xinfeng Li, Yifan Jiang, Dongxia Wang, Yihao Huang, Yufei Guo, Jen-tse Huang, Yanwei Yue, Wenke Huang, Guancheng Wan, Tianlin Li, Lei Bai, Jie Zhang, Qing Guo, Jingyi Wang, Tianlong Chen, Joey Tianyi Zhou, Xiaojun Jia, Weisong Sun, Cong Wu, Jing Chen, Xuming Hu, Yiming Li, Xiao Wang, Ningyu Zhang, Luu Anh Tuan, Guowen Xu, Tianwei Zhang, Xingjun Ma, Xiang Wang, Bo An, Jun Sun, Mohit Bansal, Shirui Pan, Yuval Elovici, Bhavya Kailkhura, Bo Li, Yaodong Yang, Hongwei Li, Wenyuan Xu, Yizhou Sun, Wei Wang, Qing Li, Ke Tang, Yu-Gang Jiang, Felix Juefei-Xu, Hui Xiong, Xiaofeng Wang, Shuicheng Yan, Dacheng Tao, Philip S. Yu, Qingsong Wen, Yang Liu | Published: 2025-04-22
アライメント
データ生成の安全性
プロンプトインジェクション

BadApex: Backdoor Attack Based on Adaptive Optimization Mechanism of Black-box Large Language Models

Authors: Zhengxian Wu, Juan Wen, Wanli Peng, Ziwei Zhang, Yinghan Zhou, Yiming Xue | Published: 2025-04-18 | Updated: 2025-04-21
プロンプトインジェクション
攻撃検出
透かし技術

GraphAttack: Exploiting Representational Blindspots in LLM Safety Mechanisms

Authors: Sinan He, An Wang | Published: 2025-04-17
アライメント
プロンプトインジェクション
脆弱性研究

The Digital Cybersecurity Expert: How Far Have We Come?

Authors: Dawei Wang, Geng Zhou, Xianglong Li, Yu Bai, Li Chen, Ting Qin, Jian Sun, Dan Li | Published: 2025-04-16
LLM性能評価
RAGへのポイズニング攻撃
プロンプトインジェクション

Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails

Authors: William Hackett, Lewis Birch, Stefan Trawicki, Neeraj Suri, Peter Garraghan | Published: 2025-04-15
LLM性能評価
プロンプトインジェクション
敵対的攻撃分析

Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models?

Authors: Yanbo Wang, Jiyang Guan, Jian Liang, Ran He | Published: 2025-04-14
プロンプトインジェクション
学習データの偏り
安全性アライメント

An Investigation of Large Language Models and Their Vulnerabilities in Spam Detection

Authors: Qiyao Tang, Xiangyang Li | Published: 2025-04-14
LLM性能評価
プロンプトインジェクション
モデルDoS

Sugar-Coated Poison: Benign Generation Unlocks LLM Jailbreaking

Authors: Yu-Hang Wu, Yu-Jie Xiong, Jie-Zhang | Published: 2025-04-08
LLMの応用
プロンプトインジェクション
大規模言語モデル