Abstract
Machine learning (ML) is increasingly being adopted in a wide variety of
application domains. Usually, a well-performing ML model relies on a large
volume of training data and high-powered computational resources. Collecting
and using such large volumes of data raises serious privacy concerns because
highly privacy-sensitive information may leak; further, evolving regulatory
environments that increasingly restrict access to and use of
privacy-sensitive data make it significantly harder to fully
benefit from the power of ML in data-driven applications. A trained ML
model may also be vulnerable to adversarial attacks such as membership,
attribute, or property inference attacks and model inversion attacks. Hence,
well-designed privacy-preserving ML (PPML) solutions are critically needed for
many emerging applications. Significant and growing research efforts from
both academia and industry aim to integrate privacy-preserving
techniques into the ML pipeline or specific algorithms, or to design
various PPML architectures. In particular, existing
PPML research cross-cuts ML, systems and applications design, as well as
security and privacy areas; hence, there is a critical need to understand
state-of-the-art research and related challenges, and to establish a roadmap
for future research in the PPML area. In this paper, we systematically review and summarize
existing privacy-preserving approaches and propose a Phase, Guarantee, and
Utility (PGU) triad-based model to understand and guide the evaluation of
various PPML solutions by decomposing their privacy-preserving functionalities.
We discuss the unique characteristics and challenges of PPML and outline
possible research directions that leverage as well as benefit multiple research
communities such as ML, distributed systems, and security and privacy.