These labels were automatically added by AI and may be inaccurate. For details, see About Literature Database.
Abstract
We propose TVineSynth, a vine copula based synthetic tabular data generator,
which is designed to balance privacy and utility, using the vine tree structure
and its truncation to do the trade-off. Contrary to synthetic data generators
that achieve DP by globally adding noise, TVineSynth performs a controlled
approximation of the estimated data generating distribution, so that it does
not suffer from poor utility of the resulting synthetic data for downstream
prediction tasks. TVineSynth introduces a targeted bias into the vine copula
model that, combined with the specific tree structure of the vine, causes the
model to zero out privacy-leaking dependencies while relying on those that are
beneficial for utility. Privacy is here measured with membership (MIA) and
attribute inference attacks (AIA). Further, we theoretically justify how the
construction of TVineSynth ensures AIA privacy under a natural privacy measure
for continuous sensitive attributes. When compared to competitor models, with
and without DP, on simulated and on real-world data, TVineSynth achieves a
superior privacy-utility balance.