Abstract
Differential privacy (DP) is becoming increasingly important for deployed
machine learning applications because it provides strong guarantees for
protecting the privacy of individuals whose data is used to train models.
However, DP mechanisms commonly used in machine learning tend to struggle on
many real-world data distributions, including highly imbalanced and small labeled
training sets. In this work, we propose SWAG-PPM, a new scalable DP mechanism
for deep learning models that uses as its randomized mechanism a pseudo
posterior distribution downweighting each record's likelihood contribution in
proportion to its disclosure risk. As a motivating example from
official statistics, we demonstrate SWAG-PPM on a workplace injury text
classification task using a highly imbalanced public dataset published by the
U.S. Occupational Safety and Health Administration (OSHA). We find that
SWAG-PPM exhibits only modest utility degradation relative to a non-private
comparator while greatly outperforming the industry-standard DP-SGD under a
similar privacy budget.
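To make the weighting idea concrete, here is a minimal sketch of a pseudo-posterior-style downweighting scheme. It is an illustration of the general technique only, not the paper's actual SWAG-PPM algorithm: the risk-to-weight mapping (a simple min-max inversion) and both function names are assumptions, and a real implementation would tie the weights to a formal privacy guarantee.

```python
import numpy as np

def pseudo_posterior_weights(risks):
    # Hypothetical mapping: per-record disclosure risks (higher = riskier)
    # are converted to weights in [0, 1], so the riskiest records
    # contribute least. The min-max inversion here is illustrative only.
    risks = np.asarray(risks, dtype=float)
    spread = risks.max() - risks.min()
    if spread == 0.0:
        return np.ones_like(risks)
    return 1.0 - (risks - risks.min()) / spread

def weighted_log_likelihood(log_liks, weights):
    # Pseudo (weighted) log-likelihood: each record's log-likelihood
    # contribution is scaled by its weight before summing, which is the
    # core downweighting idea behind pseudo posterior mechanisms.
    return float(np.sum(np.asarray(weights) * np.asarray(log_liks)))

# Example: three records with increasing disclosure risk.
w = pseudo_posterior_weights([0.1, 0.5, 0.9])
print(w)  # riskiest record receives the smallest weight
print(weighted_log_likelihood([-1.0, -2.0, -0.5], w))
```

With all weights equal to 1, the weighted sum reduces to the ordinary log-likelihood; shrinking a record's weight toward 0 removes its influence on the posterior.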
External Datasets
U.S. Occupational Safety and Health Administration (OSHA) Severe Injury Reports