The sparse vector technique is a powerful differentially private primitive
that allows an analyst to check whether queries in a stream are greater or
lesser than a threshold. This technique has a unique property -- the algorithm
works by adding noise with a finite variance to the queries and the threshold,
and guarantees privacy that only degrades with (a) the maximum sensitivity of
any one query in stream, and (b) the number of positive answers output by the
algorithm. Recent work has developed variants of this algorithm, which we call
{\em generalized private threshold testing}, and are claimed to have privacy
guarantees that do not depend on the number of positive or negative answers
output by the algorithm. These algorithms result in a significant improvement
in utility over the sparse vector technique for a given privacy budget, and
have found applications in frequent itemset mining, feature selection in
machine learning and generating synthetic data.
In this paper we critically analyze the privacy properties of generalized
private threshold testing. We show that generalized private threshold testing
does not satisfy \epsilon-differential privacy for any finite \epsilon. We
identify a subtle error in the privacy analysis of this technique in prior
work. Moreover, we show an adversary can use generalized private threshold
testing to recover counts from the datasets (especially small counts) exactly
with high accuracy, and thus can result in individuals being reidentified. We
demonstrate our attacks empirically on real datasets.