Prompt Injection

Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM

Authors: Bochuan Cao, Yuanpu Cao, Lu Lin, Jinghui Chen | Published: 2023-09-18 | Updated: 2024-06-12

Prompt Injection

Safety Alignment

Defense Methods

2023.09.18 2025.04.03

Literature Database

FuzzLLM: A Novel and Universal Fuzzing Framework for Proactively Discovering Jailbreak Vulnerabilities in Large Language Models

Authors: Dongyu Yao, Jianshu Zhang, Ian G. Harris, Marcel Carlsson | Published: 2023-09-11 | Updated: 2024-04-14

LLM Security

Watermarking

Prompt Injection

2023.09.11 2025.04.03

Literature Database

Vulnerability of Machine Learning Approaches Applied in IoT-based Smart Grid: A Review

Authors: Zhenyong Zhang, Mengxiang Liu, Mingyang Sun, Ruilong Deng, Peng Cheng, Dusit Niyato, Mo-Yuen Chow, Jiming Chen | Published: 2023-08-30 | Updated: 2023-12-25

Energy Management

Prompt Injection

Adversarial Training

2023.08.30 2025.04.03

Literature Database

Detecting Language Model Attacks with Perplexity

Authors: Gabriel Alon, Michael Kamfonas | Published: 2023-08-27 | Updated: 2023-11-07

LLM Security

Prompt Injection

Malicious Prompts

2023.08.27 2025.04.03

Literature Database

Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities

Authors: Maximilian Mozes, Xuanli He, Bennett Kleinberg, Lewis D. Griffin | Published: 2023-08-24

Prompt Injection

Malicious Content Generation

Adversarial Examples

2023.08.24 2025.04.03

Literature Database

Devising and Detecting Phishing: Large Language Models vs. Smaller Human Models

Authors: Fredrik Heiding, Bruce Schneier, Arun Vishwanath, Jeremy Bernstein, Peter S. Park | Published: 2023-08-23 | Updated: 2023-11-30

Phishing

Phishing Attacks

Prompt Injection

2023.08.23 2025.04.03

Literature Database

Time Travel in LLMs: Tracing Data Contamination in Large Language Models

Authors: Shahriar Golchin, Mihai Surdeanu | Published: 2023-08-16 | Updated: 2024-02-21

Data Contamination Detection

Prompt Injection

Natural Language Processing

2023.08.16 2025.04.03

Literature Database

Robustness Over Time: Understanding Adversarial Examples’ Effectiveness on Longitudinal Versions of Large Language Models

Authors: Yugeng Liu, Tianshuo Cong, Zhengyu Zhao, Michael Backes, Yun Shen, Yang Zhang | Published: 2023-08-15 | Updated: 2024-05-06

Prompt Injection

Model Performance Evaluation

Robustness Evaluation

2023.08.15 2025.04.03

Literature Database

PentestGPT: An LLM-empowered Automatic Penetration Testing Tool

Authors: Gelei Deng, Yi Liu, Víctor Mayoral-Vilches, Peng Liu, Yuekang Li, Yuan Xu, Tianwei Zhang, Yang Liu, Martin Pinzger, Stefan Rass | Published: 2023-08-13 | Updated: 2024-06-02

Prompt Injection

Penetration Testing Methods

Performance Evaluation

2023.08.13 2025.04.03

Literature Database

An Empirical Study on Using Large Language Models to Analyze Software Supply Chain Security Failures

Authors: Tanmay Singla, Dharun Anandayuvaraj, Kelechi G. Kalu, Taylor R. Schorlemmer, James C. Davis | Published: 2023-08-09

Cyberattacks

Prompt Injection

Model Performance Evaluation

2023.08.09 2025.04.03

Literature Database