Training neural networks usually requires large amounts of sensitive training
data, and how to protect the privacy of training data has thus become a
critical topic in deep learning research. InstaHide is a state-of-the-art
scheme to protect training data privacy with only minor effects on test
accuracy, and its security has become a salient question. In this paper, we
systematically study recent attacks on InstaHide and present a unified
framework to understand and analyze these attacks. We find that existing
attacks either do not have a provable guarantee or can only recover a single
private image. On the current InstaHide challenge setup, where each InstaHide
image is a mixture of two private images, we present a new algorithm to recover
all the private images with a provable guarantee and optimal sample complexity.
In addition, we provide a computational hardness result on retrieving all
InstaHide images. Our results demonstrate that InstaHide is not
information-theoretically secure but is computationally secure in the worst
case, even when mixing two private images.
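To make the setup concrete, the following is a minimal sketch of an InstaHide-style encoding of two private images: a mixup-style linear combination followed by a random per-pixel sign flip. The coefficient range, the image representation as float arrays in [0, 1], and the omission of any additional public-image mixing are simplifying assumptions for illustration, not the exact challenge parameters.

```python
import numpy as np

def instahide_encode(x1, x2, rng=None):
    """Sketch of an InstaHide-style encoding of two private images.

    x1, x2 : float arrays of identical shape with values in [0, 1]
             (the two private images).
    Returns a single encoded image.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Mixup-style combination with a random convex coefficient.
    # The range [0.25, 0.75] is an illustrative assumption.
    lam = rng.uniform(0.25, 0.75)
    mixed = lam * x1 + (1.0 - lam) * x2

    # Independent random sign flip per pixel, which removes sign information
    # from the released image.
    signs = rng.choice([-1.0, 1.0], size=mixed.shape)
    return signs * mixed
```

The attacks studied in the paper aim to invert many such encodings jointly; recovering a single private image from one encoding is strictly easier than recovering all private images across the full set of released InstaHide images.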