Abstract
Sparse inner product (SIP) has the attractive property that its overhead is
dominated by the intersection of the parties' inputs, independent of the
total input size. It holds intriguing promise, especially for boosting machine
learning on large-scale data, which is often sparse. In this paper,
we investigate privacy-preserving SIP, a problem that has rarely been explored
before. Specifically, we propose two concrete constructions: one requires offline
linear communication that can be amortized across queries, while the other
achieves sublinear overhead but relies on more computationally expensive tools. Our
approach builds on state-of-the-art cryptographic tools, including garbled Bloom
filters (GBF) and Private Information Retrieval (PIR), as its cornerstone, but
carefully fuses them to obtain non-trivial overhead reductions. We provide
formal security analysis of the proposed constructions and apply them to
representative machine learning algorithms, including k-nearest neighbors, naive
Bayes classification, and logistic regression. Compared to existing efforts,
our method achieves a $2$-$50\times$ speedup in runtime and up to a $10\times$
reduction in communication.
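For intuition, the plaintext functionality that the privacy-preserving constructions realize can be sketched as follows (a minimal illustration of sparse inner product in the clear, not the protocol itself; the function name and dictionary representation are our own assumptions):

```python
def sparse_inner_product(x, y):
    """Inner product of two sparse vectors given as dicts
    mapping feature index -> value. Only keys in the
    intersection contribute, so the cost is driven by the
    intersection size rather than the full dimension."""
    # Iterate over the smaller operand for efficiency.
    if len(x) > len(y):
        x, y = y, x
    return sum(v * y[k] for k, v in x.items() if k in y)

x = {2: 1.0, 7: 3.0, 100: 2.0}
y = {7: 2.0, 100: 0.5, 9: 4.0}
result = sparse_inner_product(x, y)  # 3.0*2.0 + 2.0*0.5 = 7.0
```

The privacy-preserving setting computes the same quantity while hiding each party's indices and values from the other.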