Logistic Regression (LR) is the most widely used machine learning model in
industry for its efficiency, robustness, and interpretability. Due to the
problem of data isolation and the requirement of high model performance, many
applications in industry call for building a secure and efficient LR model for
multiple parties. Most existing work uses either Homomorphic Encryption (HE) or
Secret Sharing (SS) to build secure LR. HE-based methods can handle
high-dimensional sparse features, but they incur potential security risks.
SS-based methods have provable security, but they suffer from efficiency issues
under high-dimensional sparse features. In this paper, we first present CAESAR, which
combines HE and SS to build a secure large-scale sparse logistic regression
model and achieves both efficiency and security. We then present the distributed
implementation of CAESAR to meet scalability requirements. We have deployed CAESAR
in a risk control task and conducted comprehensive experiments. Our
experimental results show that CAESAR improves on the state-of-the-art model by
a factor of around 130.
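
For readers unfamiliar with Secret Sharing, the following is a minimal sketch of two-party additive secret sharing, the general building block that SS-based methods rely on. It is not the CAESAR protocol itself: the ring size `MODULUS` and the helper names `share`, `reconstruct`, `add_shared`, and `scale_shared` are assumptions introduced purely for illustration.

```python
import secrets
from typing import Tuple

# Assumed ring size, chosen only for illustration.
MODULUS = 2 ** 64

Shares = Tuple[int, int]


def share(value: int) -> Shares:
    """Split an integer into two additive shares modulo MODULUS."""
    share_a = secrets.randbelow(MODULUS)   # uniformly random mask
    share_b = (value - share_a) % MODULUS  # complement so the shares sum to value
    return share_a, share_b


def reconstruct(shares: Shares) -> int:
    """Recombine two additive shares into the original value."""
    return (shares[0] + shares[1]) % MODULUS


def add_shared(x: Shares, y: Shares) -> Shares:
    """Add two shared values; each party only adds the shares it holds."""
    return (x[0] + y[0]) % MODULUS, (x[1] + y[1]) % MODULUS


def scale_shared(x: Shares, public_constant: int) -> Shares:
    """Multiply a shared value by a public constant, locally on each share."""
    return (x[0] * public_constant) % MODULUS, (x[1] * public_constant) % MODULUS


if __name__ == "__main__":
    x, y = share(5), share(7)
    print(reconstruct(add_shared(x, y)))   # sum of the two secrets
    print(reconstruct(scale_shared(x, 3))) # first secret times a public constant
```

In this sketch, addition and multiplication by a public constant need no communication between the parties, and each individual share is uniformly random on its own. Multiplying two shared values requires additional interaction, which this sketch does not cover.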