Literature Database

Robust Lipschitz Bandits to Adversarial Corruptions

Authors: Yue Kang, Cho-Jui Hsieh, Thomas C. M. Lee | Published: 2023-05-29 | Updated: 2023-10-08

Reinforcement Learning

Adversarial Attacks

Machine Learning Methods

2023.05.29 2025.04.03

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Authors: Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn | Published: 2023-05-29 | Updated: 2024-07-29

Alignment

Reward Mechanism Design

Reinforcement Learning Optimization

2023.05.29 2025.04.03

Membership Inference Attacks against Language Models via Neighbourhood Comparison

Authors: Justus Mattern, Fatemehsadat Mireshghallah, Zhijing Jin, Bernhard Schölkopf, Mrinmaya Sachan, Taylor Berg-Kirkpatrick | Published: 2023-05-29 | Updated: 2023-08-07

LLM Performance Evaluation

Privacy Protection Methods

Defense Methods

2023.05.29 2025.04.03

LLMs Can Understand Encrypted Prompt: Towards Privacy-Computing Friendly Transformers

Authors: Xuanqi Liu, Zhuotao Liu | Published: 2023-05-28 | Updated: 2023-12-15

DNN IP Protection Methods

LLM Performance Evaluation

Privacy Protection Methods

2023.05.28 2025.04.03

The Curse of Recursion: Training on Generated Data Makes Models Forget

Authors: Ilia Shumailov, Zakhar Shumaylov, Yiren Zhao, Yarin Gal, Nicolas Papernot, Ross Anderson | Published: 2023-05-27 | Updated: 2024-04-14

LLM Performance Evaluation

Sampling Methods

Model Interpretability

2023.05.27 2025.04.03

Improved Privacy-Preserving PCA Using Optimized Homomorphic Matrix Multiplication

Authors: Xirong Ma | Published: 2023-05-27 | Updated: 2023-08-17

Privacy Protection Methods

Convergence Properties

Encryption Methods

2023.05.27 2025.04.03

On Evaluating Adversarial Robustness of Large Vision-Language Models

Authors: Yunqing Zhao, Tianyu Pang, Chao Du, Xiao Yang, Chongxuan Li, Ngai-Man Cheung, Min Lin | Published: 2023-05-26 | Updated: 2023-10-29

LLM Performance Evaluation

Prompt Injection

Adversarial Attacks

2023.05.26 2025.04.03

CyPhERS: A Cyber-Physical Event Reasoning System providing real-time situational awareness for attack and fault response

Authors: Nils Müller, Kaibin Bao, Jörg Matthes, Kai Heussen | Published: 2023-05-26

CPS Control Models

Cyberattacks

Anomaly Detection Methods

2023.05.26 2025.04.03

Undetectable Watermarks for Language Models

Authors: Miranda Christ, Sam Gunn, Or Zamir | Published: 2023-05-25

Prompt Leaking

Digital Watermarking for Generative AI

Watermarking Techniques

2023.05.25 2025.04.03

Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy

Authors: Zhihong Shao, Yeyun Gong, Yelong Shen, Minlie Huang, Nan Duan, Weizhu Chen | Published: 2023-05-24 | Updated: 2023-10-23

RAG

Role of Artificial Intelligence

Information Retrieval

2023.05.24 2025.04.03
