Private Transformer Inference in MLaaS: A Survey

Authors: Yang Li, Xinyu Zhou, Yitong Wang, Liangxin Qian, Jun Zhao | Published: 2025-05-15

2025.05.15 / 2025.05.28

Source: https://arxiv.org/abs/2505.10315

PDF: https://arxiv.org/pdf/2505.10315

Labels Predicted by AI

Encryption Technology, Computational Consistency, Machine Learning

Please note that these labels were automatically added by AI. Therefore, they may not be entirely accurate.

Abstract

Transformer models have revolutionized AI, powering applications like content generation and sentiment analysis. However, their deployment in Machine Learning as a Service (MLaaS) raises significant privacy concerns, primarily due to the centralized processing of sensitive user data. Private Transformer Inference (PTI) offers a solution by utilizing cryptographic techniques such as secure multi-party computation and homomorphic encryption, enabling inference while preserving both user data and model privacy. This paper reviews recent PTI advancements, highlighting state-of-the-art solutions and challenges. We also introduce a structured taxonomy and evaluation framework for PTI, focusing on balancing resource efficiency with privacy and bridging the gap between high-performance inference and data privacy.