プロンプトインジェクション

LLMs Cannot Reliably Judge (Yet?): A Comprehensive Assessment on the Robustness of LLM-as-a-Judge

Authors: Songze Li, Chuokun Xu, Jiaying Wang, Xueluan Gong, Chen Chen, Jirui Zhang, Jun Wang, Kwok-Yan Lam, Shouling Ji | Published: 2025-06-11

LLMの安全機構の解除

プロンプトインジェクション

敵対的攻撃

2025.06.11

文献データベース

Design Patterns for Securing LLM Agents against Prompt Injections

Authors: Luca Beurer-Kellner, Beat Buesser Ana-Maria Creţu, Edoardo Debenedetti, Daniel Dobos, Daniel Fabian, Marc Fischer, David Froelicher, Kathrin Grosse, Daniel Naeff, Ezinwanne Ozoani, Andrew Paverd, Florian Tramèr, Václav Volhejn | Published: 2025-06-10 | Updated: 2025-06-11

インダイレクトプロンプトインジェクション

プロンプトインジェクション

防御手法

2025.06.10

文献データベース

MalGEN: A Generative Agent Framework for Modeling Malicious Software in Cybersecurity

Authors: Bikash Saha, Sandeep Kumar Shukla | Published: 2025-06-09

サイバー脅威

プロンプトインジェクション

マルウェア生成

2025.06.09

文献データベース

JavelinGuard: Low-Cost Transformer Architectures for LLM Security

Authors: Yash Datta, Sharath Rajasekar | Published: 2025-06-09

プライバシー保護技術

プロンプトインジェクション

モデルアーキテクチャ

2025.06.09

文献データベース

Chain-of-Code Collapse: Reasoning Failures in LLMs via Adversarial Prompting in Code Generation

Authors: Jaechul Roh, Varun Gandhi, Shivani Anilkumar, Arin Garg | Published: 2025-06-08 | Updated: 2025-06-12

パフォーマンス評価

プロンプトインジェクション

プロンプトリーキング

2025.06.08

文献データベース

Evaluating Apple Intelligence’s Writing Tools for Privacy Against Large Language Model-Based Inference Attacks: Insights from Early Datasets

Authors: Mohd. Farhan Israk Soumik, Syed Mhamudul Hasan, Abdur R. Shahid | Published: 2025-06-04

テキスト分類の応用

プライバシー問題

プロンプトインジェクション

2025.06.04

文献データベース

Client-Side Zero-Shot LLM Inference for Comprehensive In-Browser URL Analysis

Authors: Avihay Cohen | Published: 2025-06-04

アライメント

プロンプトインジェクション

動的分析

2025.06.04

文献データベース

CyberGym: Evaluating AI Agents’ Cybersecurity Capabilities with Real-World Vulnerabilities at Scale

Authors: Zhun Wang, Tianneng Shi, Jingxuan He, Matthew Cai, Jialin Zhang, Dawn Song | Published: 2025-06-03

プロンプトインジェクション

動的分析手法

透かし評価

2025.06.03

文献データベース

BitBypass: A New Direction in Jailbreaking Aligned Large Language Models with Bitstream Camouflage

Authors: Kalyan Nakka, Nitesh Saxena | Published: 2025-06-03

LLMの安全機構の解除

フィッシング攻撃の検出率

プロンプトインジェクション

2025.06.03

文献データベース

Beyond the Protocol: Unveiling Attack Vectors in the Model Context Protocol (MCP) Ecosystem

Authors: Hao Song, Yiming Shen, Wenxuan Luo, Leixin Guo, Ting Chen, Jiashui Wang, Beibei Li, Xiaosong Zhang, Jiachi Chen | Published: 2025-05-31 | Updated: 2025-08-20

インダイレクトプロンプトインジェクション

プロンプトインジェクション

攻撃タイプ

2025.05.31

文献データベース