LLM Performance Evaluation

Membership Inference Attacks against Language Models via Neighbourhood Comparison

Authors: Justus Mattern, Fatemehsadat Mireshghallah, Zhijing Jin, Bernhard Schölkopf, Mrinmaya Sachan, Taylor Berg-Kirkpatrick | Published: 2023-05-29 | Updated: 2023-08-07
LLM Performance Evaluation
Privacy Protection Method
Defense Method

LLMs Can Understand Encrypted Prompt: Towards Privacy-Computing Friendly Transformers

Authors: Xuanqi Liu, Zhuotao Liu | Published: 2023-05-28 | Updated: 2023-12-15
DNN IP Protection Method
LLM Performance Evaluation
Privacy Protection Method

The Curse of Recursion: Training on Generated Data Makes Models Forget

Authors: Ilia Shumailov, Zakhar Shumaylov, Yiren Zhao, Yarin Gal, Nicolas Papernot, Ross Anderson | Published: 2023-05-27 | Updated: 2024-04-14
LLM Performance Evaluation
Sampling Method
Model Interpretability

On Evaluating Adversarial Robustness of Large Vision-Language Models

Authors: Yunqing Zhao, Tianyu Pang, Chao Du, Xiao Yang, Chongxuan Li, Ngai-Man Cheung, Min Lin | Published: 2023-05-26 | Updated: 2023-10-29
LLM Performance Evaluation
Prompt Injection
Adversarial Attack

Quantifying Association Capabilities of Large Language Models and Its Implications on Privacy Leakage

Authors: Hanyin Shao, Jie Huang, Shen Zheng, Kevin Chen-Chuan Chang | Published: 2023-05-22 | Updated: 2024-02-09
LLM Performance Evaluation
Privacy Violation
Privacy Protection Method