Secure multi-party machine learning allows several parties to build a model
on their pooled data to increase utility while not explicitly sharing data with
each other. We show that such multi-party computation can cause leakage of
global dataset properties between the parties even when parties obtain only
black-box access to the final model. In particular, a ``curious'' party can
infer the distribution of sensitive attributes in other parties' data with high
accuracy. This raises concerns regarding the confidentiality of properties
pertaining to the whole dataset as opposed to individual data records. We show
that our attack can leak population-level properties in datasets of different
types, including tabular, text, and graph data. To understand and measure the
source of leakage, we consider several models of correlation between a
sensitive attribute and the rest of the data. Using multiple machine learning
models, we show that leakage occurs even if the sensitive attribute is not
included in the training data and has a low correlation with other attributes
or the target variable.