On Discrete Prompt Optimization for Diffusion Models

arXiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2407.01606

PDF

https://arxiv.org/pdf/2407.01606

Bibliographic Information

Authors: Ruochen Wang; Ting Liu; Cho-Jui Hsieh; Boqing Gong
Published: 2024-6-27
Affiliation: Google Research
Country of affiliation: United States of America
Conference: International Conference on Machine Learning (ICML)

Labels Estimated by AI

Prompt Injection

Prompt Engineering

Watermarking

Note: These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the Literature Database".

Abstract

This paper introduces the first gradient-based framework for prompt optimization in text-to-image diffusion models. We formulate prompt engineering as a discrete optimization problem over the language space. Two major challenges arise in efficiently finding a solution to this problem: (1) Enormous Domain Space: Setting the domain to the entire language space poses significant difficulty to the optimization process. (2) Text Gradient: Efficiently computing the text gradient is challenging, as it requires backpropagating through the inference steps of the diffusion model and a non-differentiable embedding lookup table. Beyond the problem formulation, our main technical contributions lie in solving the above challenges. First, we design a family of dynamically generated compact subspaces comprised of only the most relevant words to user input, substantially restricting the domain space. Second, we introduce "Shortcut Text Gradient" -- an effective replacement for the text gradient that can be obtained with constant memory and runtime. Empirical evaluation on prompts collected from diverse sources (DiffusionDB, ChatGPT, COCO) suggests that our method can discover prompts that substantially improve (prompt enhancement) or destroy (adversarial attack) the faithfulness of images generated by the text-to-image diffusion model.
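
The problem setup in the abstract (a discrete search over word choices, restricted to a compact subspace of words relevant to the user's prompt) can be illustrated with a minimal sketch. Everything named here is hypothetical: the `candidates` table, the `toy_score` objective, and the function names are assumptions for illustration and do not come from the paper. Exhaustive enumeration stands in for the paper's gradient-based search, which the abstract does not detail, and a realistic objective would score images generated by the diffusion model rather than count words.

```python
from itertools import product


def build_subspace(prompt_words, candidates):
    """Build a per-position candidate list: the original word plus its substitutes.

    `candidates` is a hypothetical lookup table; words without an entry
    can only keep their original form.
    """
    return [[w] + [c for c in candidates.get(w, []) if c != w] for w in prompt_words]


def search_prompt(prompt, candidates, score_fn):
    """Return the highest-scoring prompt within the restricted subspace.

    Exhaustive enumeration is only feasible because the subspace is small;
    it is a stand-in here, not the paper's optimizer.
    """
    words = prompt.split()
    subspace = build_subspace(words, candidates)
    best = max(product(*subspace), key=score_fn)
    return " ".join(best)


def toy_score(words):
    """Toy objective (assumption): count how many preferred words appear."""
    preferred = {"vivid", "sunset"}
    return sum(w in preferred for w in words)


# Hypothetical substitutes for two of the four words in the prompt.
candidates = {"nice": ["vivid", "pleasant"], "evening": ["sunset", "dusk"]}
result = search_prompt("a nice evening sky", candidates, toy_score)
print(result)
```

With two positions offering three choices each, the subspace holds only nine prompts, whereas an unrestricted vocabulary would make the same enumeration intractable. That gap is the motivation the abstract gives for restricting the domain before optimizing.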

External Datasets

DiffusionDB

COCO

ChatGPT