Abstract
Recent research has shown that structured machine learning models such as
tree ensembles are vulnerable to privacy attacks targeting their training data.
To mitigate these risks, differential privacy (DP) has become a widely adopted
countermeasure, as it offers rigorous privacy protection. In this paper, we
introduce a reconstruction attack targeting state-of-the-art $\epsilon$-DP
random forests. By leveraging a constraint programming model that incorporates
knowledge of the forest's structure and DP mechanism characteristics, our
approach formally reconstructs the most likely dataset that could have produced
a given forest. Through extensive computational experiments, we examine the
interplay between model utility, privacy guarantees and reconstruction accuracy
across various configurations. Our results reveal that random forests trained
with meaningful DP guarantees can still leak portions of their training data.
Specifically, while DP reduces the success of reconstruction attacks, the only
forests fully robust to our attack exhibit predictive performance no better
than a constant classifier. Building on these insights, we also provide
practical recommendations for constructing DP random forests that are
more resilient to reconstruction attacks while maintaining non-trivial
predictive performance.
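For context on the kind of DP mechanism such attacks can exploit: DP decision-tree learners commonly release per-class leaf counts perturbed with Laplace noise. The sketch below is illustrative only and is not the paper's specific mechanism or attack; the function name and interface are hypothetical.

```python
import numpy as np

def laplace_noisy_counts(counts, epsilon, rng=None):
    """Release epsilon-DP per-class counts for one leaf.

    Each training record falls in exactly one leaf and contributes 1
    to a single class count, so the L1 sensitivity of the count vector
    is 1 and adding Laplace(1/epsilon) noise satisfies epsilon-DP.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.asarray(counts, dtype=float)
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon, size=counts.shape)
    return counts + noise

# A leaf holding 30 class-0 and 5 class-1 training records, epsilon = 1:
noisy = laplace_noisy_counts([30, 5], epsilon=1.0)
```

Smaller epsilon means larger noise scale and stronger privacy, but noisier counts degrade the leaf's majority vote; this is the utility/privacy trade-off the experiments quantify against reconstruction accuracy.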