Abstract
We investigate the application of large language models (LLMs), specifically
GPT-4, to scenarios involving the tradeoff between privacy and utility in
tabular data. Our approach serializes tabular data points into text and then
prompts GPT-4 with precise sanitization instructions in a zero-shot manner.
The primary objective is to sanitize the tabular data so that existing
machine learning models cannot accurately infer private features while still
being able to accurately infer utility-related attributes. We explore
various sanitization instructions.
Notably, we discover that this relatively simple approach yields performance
comparable to more complex adversarial optimization methods used for managing
privacy-utility tradeoffs. Furthermore, while the prompts successfully hide
private features from existing machine learning models, we observe that this
obscuration alone does not necessarily satisfy a range of fairness metrics.
Nevertheless, our research indicates that LLMs can potentially adhere to
these fairness metrics as well, with some of our experimental results
matching those achieved by well-established adversarial optimization
techniques.
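
To make the prompting setup concrete, the following is a minimal sketch of
serializing one tabular record to text and issuing a zero-shot sanitization
instruction. The column names, prompt wording, and model identifier are
illustrative assumptions, not the paper's exact configuration:

```python
# Sketch of the zero-shot sanitization setup: one tabular record is
# serialized to text and GPT-4 is asked to rewrite it so that a private
# attribute is hidden while a utility-related attribute stays inferable.
# Column names and prompt wording are assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def serialize_row(row: dict) -> str:
    """Turn one tabular record into a plain-text attribute list."""
    return "; ".join(f"{col} = {val}" for col, val in row.items())


def sanitize_row(row: dict, private_attr: str, utility_attr: str) -> str:
    """Ask the model to sanitize one record in a zero-shot manner."""
    prompt = (
        "Below is one record from a tabular dataset, written as text.\n"
        f"Record: {serialize_row(row)}\n\n"
        f"Rewrite the record so that '{private_attr}' can no longer be "
        f"inferred from it, while keeping enough information to infer "
        f"'{utility_attr}'. Return only the rewritten record in the same "
        "'column = value' format."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep outputs as reproducible as possible
    )
    return response.choices[0].message.content


# Example usage with a hypothetical census-style record:
row = {"age": 37, "occupation": "Sales", "sex": "Female", "income": ">50K"}
print(sanitize_row(row, private_attr="sex", utility_attr="income"))
```

In this reading, privacy and utility are both measured downstream: the
sanitized records are evaluated by whether existing classifiers fail to
recover the private attribute yet still predict the utility attribute.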