Abstract
The Private Aggregation of Teacher Ensembles (PATE) framework enables
privacy-preserving machine learning by aggregating responses from disjoint
subsets of sensitive data. Adaptations of PATE to tasks with inherent output
diversity, such as text generation, where the desired output is a sample from a
distribution, face a core tension: as diversity increases, samples from
different teachers are less likely to agree, but lower agreement results in
reduced utility for the same privacy requirements. Yet suppressing diversity to
artificially increase agreement is undesirable, as it distorts the output of
the underlying model, and thus reduces output quality.
We propose Hot PATE, a variant of PATE designed for diverse generative
settings. We formalize the notion of a diversity-preserving ensemble sampler
and introduce an efficient sampler that provably transfers diversity without
incurring additional privacy cost. Hot PATE requires only API access to
proprietary models and can be used as a drop-in replacement for existing Cold
PATE samplers. Our empirical evaluations corroborate and quantify the benefits,
showing significant improvements in the privacy-utility trade-off on evaluated
in-context learning tasks, both in preserving diversity and in returning
relevant responses.
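
The tension between diversity and agreement described in the first paragraph can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions, not the sampler proposed in the paper: every teacher is assumed to sample one token independently from the same distribution, and the function name `top_vote_share` and both example distributions are hypothetical.

```python
import random
from collections import Counter


def top_vote_share(probs, n_teachers, rng):
    """Share of teachers that agree on the most frequent token.

    Assumption for illustration: each teacher draws one token
    independently from the same distribution `probs`.
    """
    tokens = list(range(len(probs)))
    votes = rng.choices(tokens, weights=probs, k=n_teachers)
    # Count of the most common token, as a fraction of all teachers.
    return Counter(votes).most_common(1)[0][1] / n_teachers


rng = random.Random(0)  # fixed seed for repeatability
n_teachers = 1000

# Hypothetical distributions: one nearly deterministic, one uniform.
low_diversity = [0.97, 0.01, 0.01, 0.01]
high_diversity = [0.1] * 10

low = top_vote_share(low_diversity, n_teachers, rng)
high = top_vote_share(high_diversity, n_teachers, rng)
print(f"top vote share, low diversity:  {low:.3f}")
print(f"top vote share, high diversity: {high:.3f}")
```

With independent sampling, the largest vote count shrinks as the distribution spreads out, so a fixed amount of privacy noise obscures the winning token more easily. This is the effect that motivates a diversity-preserving ensemble sampler.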