MPC-Minimized Secure LLM Inference | AI Security Portal

arxiv

MPC-Minimized Secure LLM Inference

AI Security Portal bot

The information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2408.03561

PDF

https://arxiv.org/pdf/2408.03561

Bibliographic Information

Authors: Deevashwer Rathee;Dacheng Li;Ion Stoica;Hao Zhang;Raluca Popa
Published: 2024-08-07
Affiliation: UC Berkeley
Country: United States of America
Venue: Computing Research Repository (CoRR)

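Labels Estimated by AI

MPC Algorithm;LLM Performance Evaluation;Model Performance Evaluation

Note: These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the Literature Database".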

Abstract

Many inference services based on large language models (LLMs) pose a privacy concern, revealing either user prompts to the service or the proprietary weights to the user. Secure inference offers a solution to this problem through secure multi-party computation (MPC); however, it is still impractical for modern LLM workloads due to the large overhead imposed by MPC. To address this overhead, we propose Marill, a framework that adapts LLM fine-tuning to minimize MPC usage during secure inference. Marill introduces high-level architectural changes during fine-tuning that significantly reduce the number of expensive operations needed within MPC during inference, by removing some and relocating others outside MPC without compromising security. As a result, Marill-generated models are more efficient across all secure inference protocols, and our approach complements MPC-friendly approximations for such operations. Compared to standard fine-tuning, Marill results in 3.6-11.3x better runtime and 2.4-6.9x better communication during secure inference across various MPC settings, while typically preserving over 90% performance across downstream tasks.
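
For readers unfamiliar with MPC, the following is a minimal sketch of additive secret sharing, one common building block of MPC protocols. It is an illustration only: the abstract does not state which protocols Marill targets, and the ring size `MODULUS` and the helper names `share`, `reconstruct`, `add_shared`, and `scale_shared` are assumptions made for this example. The sketch shows that additions and multiplications by public constants can be done locally on shares, with no communication between parties. Operations that combine two secret values, such as multiplications and comparisons, need interaction between the parties, which is the kind of cost that reducing work inside MPC avoids.

```python
import secrets

# Ring size for the shares; 2**64 is an assumption chosen for this sketch.
MODULUS = 2**64


def share(value, n_parties=2):
    """Split `value` into `n_parties` additive shares modulo MODULUS.

    Any subset of fewer than n_parties shares is uniformly random and
    reveals nothing about `value`.
    """
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recover the secret by summing all shares modulo MODULUS."""
    return sum(shares) % MODULUS


def add_shared(a_shares, b_shares):
    """Add two shared secrets: each party adds its own shares locally."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]


def scale_shared(shares, public_constant):
    """Multiply a shared secret by a public constant, again locally."""
    return [(s * public_constant) % MODULUS for s in shares]


if __name__ == "__main__":
    x = share(5)
    y = share(7)
    print(reconstruct(add_shared(x, y)))   # sum of the two secrets
    print(reconstruct(scale_shared(x, 3)))  # secret scaled by a public value
```

Each party holds one entry of every share list and never sees the others, so no single party learns the underlying values.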

External Datasets

ShareGPT

MTBench

MagiCoder

HumanEval

ParroT

WMT22

References

ICLR

On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes

Rishabh Agarwal, Nino Vieillard, Yongchao Zhou, Piotr Stanczyk, Sabela Ramos, Matthieu Geist, Olivier Bachem

Published: 2024

Introducing the next generation of Claude

Published: 2024

Fine-Tuning Llama-2: A Comprehensive Case Study for Tailoring Models to Unique Applications

Published: 2023

CCS

HELiKs: HE Linear Algebra Kernels for Secure Inference

Shashank Balla, Farinaz Koushanfar

Published: 2023

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

On attention redundancy: A comprehensive study

Y. Bian, J. Huang, X. Cai, J. Yuan, K. Church

Published: 2021

Towards Encrypted Large Language Models with FHE

Roman Bredehoft, Jordan Frery

Published: 2023

Journal of Cryptology

Security and composition of multiparty cryptographic protocols

R. Canetti

Published: 2000

Evaluating Large Language Models Trained on Code

M. Chen, J. Tworek, H. Jun, Q. Yuan, H. P. D. O. Pinto, J. Kaplan, H. Edwards, Y. Burda, N. Joseph, G. Brockman et al.

Published: 2021

Findings of the Association for Computational Linguistics: ACL 2022

THE-X: privacy-preserving transformer inference with homomorphic encryption

Tianyu Chen, Hangbo Bao, Shaohan Huang, Li Dong, Binxing Jiao, Daxin Jiang, Haoyi Zhou, Jianxin Li, Furu Wei

Published: 2022