Machine learning (ML) models are applied in an increasing variety of domains.
The availability of large amounts of data and computational resources
encourages the development of ever more complex and valuable models. These
models are considered the intellectual property of the legitimate parties that
trained them, which makes protecting them against theft, illegitimate
redistribution, and unauthorized use an urgent need. Digital watermarking
provides a powerful mechanism for marking model ownership and thereby
protecting against these threats. This work presents a taxonomy
identifying and analyzing different classes of watermarking schemes for ML
models. It introduces a unified threat model to allow structured reasoning on
and comparison of the effectiveness of watermarking methods in different
scenarios. Furthermore, it systematizes desired security requirements and
attacks against ML model watermarking. Based on this framework, representative
literature from the field is surveyed to illustrate the taxonomy. Finally,
shortcomings and general limitations of existing approaches are discussed, and
an outlook on future research directions is given.