In privacy-preserving machine learning, it is common that the owner of the
learned model does not have physical access to the data. Instead, the model
owner is granted only secure remote access to a data lake, without the ability
to retrieve data from it. Yet, the model owner may want to periodically export
the trained model from the remote repository, and the question arises of
whether this poses a risk of data leakage. In this paper, we
introduce the concept of a data-stealing attack performed during the export of
neural networks. It consists in hiding information in the exported network
that allows images initially stored in the data lake to be reconstructed
outside of it. More precisely, we show that it is possible to train a network
that can perform lossy image compression and at the same time solve some
utility tasks such as image segmentation. The attack then proceeds by
exporting the compression decoder together with the image codes, which allows
the images to be reconstructed outside the data lake. We explore the
feasibility of
such attacks on databases of CT and MR images, showing that it is possible to
obtain perceptually meaningful reconstructions of the target dataset, and that
the stolen dataset can in turn be used to solve a broad range of tasks.
Comprehensive experiments and analyses show that data-stealing attacks should
be considered a threat to sensitive imaging data sources.
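
To make the mechanism concrete, the following is a minimal sketch, assuming a
PyTorch setting, of the dual-objective training described above: a shared
encoder feeds both a reconstruction decoder (the covert compression channel)
and a segmentation head (the overt utility task). All names here
(DualObjectiveNet, attack_loss, lam) are hypothetical illustrations, not the
authors' implementation.

```python
# Hedged sketch of a dual-objective network: one encoder, two heads.
# Hypothetical names; layer sizes are arbitrary placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualObjectiveNet(nn.Module):
    def __init__(self, code_channels: int = 8):
        super().__init__()
        # Shared encoder: downsamples the image into a compact latent code.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, code_channels, 3, stride=2, padding=1),
        )
        # Reconstruction decoder: exporting this decoder together with the
        # stored codes is what enables reconstruction outside the data lake.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(code_channels, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )
        # Utility head: a plausible cover task (here, binary segmentation).
        self.seg_head = nn.Sequential(
            nn.ConvTranspose2d(code_channels, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), self.seg_head(code), code

def attack_loss(recon, seg_logits, image, mask, lam=1.0):
    # Joint objective: the overt segmentation loss plus a reconstruction
    # term, so the exported weights carry both capabilities. The trade-off
    # weight `lam` is an arbitrary choice in this sketch.
    utility = F.binary_cross_entropy_with_logits(seg_logits, mask)
    stealth = F.mse_loss(recon, image)
    return utility + lam * stealth
```

Under these assumptions, only the decoder weights and the per-image code
tensors need to leave the data lake; calling the decoder on a stored code
outside the data lake then yields an approximate (lossy) copy of the original
image.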