This paper considers the scenario in which multiple data owners wish to apply a
machine learning method over their combined dataset to obtain the best possible
learning output, but do not want to share their local datasets owing to privacy
concerns. We design systems for the case in which the stochastic
gradient descent (SGD) algorithm is used as the machine learning method because
SGD (or its variants) is at the heart of recent deep learning techniques over
neural networks. Our systems differ from existing systems in the following
features: {\bf (1)} any activation function can be used, meaning that no
privacy-preserving-friendly approximation is required; {\bf (2)} the weight
parameters, rather than the gradients computed by SGD, are shared; and
{\bf (3)} the systems are robust against colluding parties, even in the extreme
case in which only one party is honest. We prove that our systems, while
privacy-preserving, achieve the same learning accuracy as SGD and hence retain
the merit of deep learning with respect to accuracy. Finally, we conduct
several experiments using benchmark datasets, and show that our systems
outperform previous systems in terms of learning accuracy.
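
As an informal illustration of feature {\bf (2)} and of the accuracy claim, the following minimal Python sketch passes only the weight parameters from one data owner to the next, with each owner running plain SGD on its own data. It is a toy under stated assumptions, not the systems designed in this paper: the one-dimensional linear model, the function names (`sgd_pass`, `train_by_weight_passing`, `train_centralized`), the toy data, and the fixed visiting order of the owners are all hypothetical, and the sketch omits any protection of the weights in transit. It shows only that, when owners are visited in a fixed order, handing weights along yields the same result as SGD over the combined dataset processed in that order.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair; hypothetical toy data format


def sgd_pass(w: float, b: float, data: List[Sample], lr: float) -> Tuple[float, float]:
    """One pass of plain SGD for the model y ~ w*x + b with squared loss."""
    for x, y in data:
        err = (w * x + b) - y
        # The gradient (err*x, err) is used locally and never leaves the owner.
        w, b = w - lr * err * x, b - lr * err
    return w, b


def train_by_weight_passing(owners: List[List[Sample]], lr: float, epochs: int) -> Tuple[float, float]:
    """Each owner updates the weights on its local data, then hands them on."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for local_data in owners:
            # Only (w, b) crosses the boundary between owners.
            w, b = sgd_pass(w, b, local_data, lr)
    return w, b


def train_centralized(owners: List[List[Sample]], lr: float, epochs: int) -> Tuple[float, float]:
    """Reference run: SGD over the pooled data, in the same sample order."""
    combined = [sample for local_data in owners for sample in local_data]
    w, b = 0.0, 0.0
    for _ in range(epochs):
        w, b = sgd_pass(w, b, combined, lr)
    return w, b


# Three hypothetical owners holding disjoint samples of y = 2x + 1.
owners = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0)],
    [(3.0, 7.0), (4.0, 9.0)],
]
shared = train_by_weight_passing(owners, lr=0.05, epochs=200)
central = train_centralized(owners, lr=0.05, epochs=200)
print(shared == central)
```

Both runs apply the same sequence of updates to the same initial weights, so the two results coincide; the comparison says nothing about privacy, which depends on how the weights are protected while being transmitted.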