Distributed machine learning aims to train a global model on distributed data without gathering all the data at a centralized location. Two approaches have been proposed: collecting and aggregating locally trained models (federated learning), and collecting and training over representative data summaries (coresets). While each approach preserves data
privacy to some extent by withholding the raw data, the exact degree of protection is unclear under sophisticated attacks that attempt to infer the raw
data from the shared information. We present the first comparison between the
two approaches in terms of target model accuracy, communication cost, and data
privacy, where the last is measured by the accuracy of a state-of-the-art
attack strategy called the membership inference attack. Our experiments
quantify the accuracy-privacy-cost tradeoff of each approach and reveal a
nontrivial comparison that can be used to guide the design of model training
processes.