Machine learning methods are widely used for a variety of prediction
problems. \emph{Prediction as a service} is a paradigm in which service
providers with technological expertise and computational resources may perform
predictions for clients. However, data privacy concerns severely restrict the
applicability of such services unless measures are in place to keep client data
private (even from the service provider). It is equally important to minimize
the amount of computation and communication required between client and server.
Fully homomorphic encryption offers a possible way out: clients encrypt their
data, and the server performs arithmetic computations directly on the encrypted
data. The main drawback of using fully homomorphic encryption is the
amount of time required to evaluate large machine learning models on encrypted
data. We combine ideas from the machine learning literature, particularly work
on binarization and sparsification of neural networks, together with
algorithmic tools to speed up and parallelize computation on encrypted data.