Universal Multi-Party Poisoning Attacks

Abstract

In this work, we demonstrate universal multi-party poisoning attacks that adapt and apply to any multi-party learning process with an arbitrary interaction pattern between the parties. More generally, we introduce and study (k, p)-poisoning attacks, in which an adversary controls k ∈ [m] of the parties and, for each corrupted party P_i, submits some poisoned data 𝒯′_i on behalf of P_i that is still “(1 − p)-close” to the correct data 𝒯_i (e.g., a 1 − p fraction of 𝒯′_i is still honestly generated). We prove that for any “bad” property B of the final trained hypothesis h (e.g., h failing on a particular test example or having “large” risk) that has an arbitrarily small constant probability of happening without the attack, there always is a (k, p)-poisoning attack that increases the probability of B from μ to μ^(1 − p ⋅ k/m) = μ + Ω(p ⋅ k/m). Our attack only uses clean labels, and it is online. More generally, we prove that for any bounded function f(x_1, …, x_n) ∈ [0, 1] defined over an n-step random process X = (x_1, …, x_n), an adversary who can override each of the n blocks with probability p (even in a correlated way) can increase the expected output by at least Ω(p ⋅ Var[f(X)]).
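To make the quantitative guarantee concrete, here is a minimal Monte Carlo sketch. It is our illustration, not the paper’s actual attack (which applies to arbitrary processes and bad properties): a toy n-step process of fair coin flips, a monotone bad-event indicator f (at least `threshold` heads), and an adversary who overrides each block to heads with probability p. All names and parameter values below are hypothetical; the measured increase in Pr[B] can be compared against the p ⋅ Var[f(X)] scale from the theorem.

```python
import random

# Toy process: n fair coin flips x_1, ..., x_n; the "bad" property B holds
# iff at least `threshold` flips come up heads, i.e. f(x) ∈ {0, 1}.
# A simple adversary overrides each block independently with probability p,
# always forcing heads; for this monotone f that strictly increases E[f].
# The paper's theorem guarantees an increase of at least Ω(p · Var[f(X)])
# for *any* bounded f; the attack here is only an illustrative special case.

def sample_f(n: int, threshold: int, p: float) -> int:
    """One run of the process; each block is overridden to heads w.p. p."""
    heads = 0
    for _ in range(n):
        if random.random() < p:      # adversary controls this block
            heads += 1               # override: force heads
        elif random.random() < 0.5:  # honest fair coin flip
            heads += 1
    return 1 if heads >= threshold else 0

def estimate(n: int, threshold: int, p: float, trials: int = 200_000) -> float:
    """Monte Carlo estimate of Pr[B] under override probability p."""
    return sum(sample_f(n, threshold, p) for _ in range(trials)) / trials

if __name__ == "__main__":
    n, threshold, p = 15, 8, 0.2          # hypothetical toy parameters
    mu = estimate(n, threshold, 0.0)      # baseline probability of B
    mu_atk = estimate(n, threshold, p)    # probability of B under attack
    var = mu * (1 - mu)                   # Var[f(X)] for a 0/1-valued f
    print(f"mu = {mu:.3f}, attacked = {mu_atk:.3f}, "
          f"increase = {mu_atk - mu:.3f}, p*Var[f(X)] = {p * var:.3f}")
```

With these toy numbers the baseline is μ ≈ 0.5, so Var[f(X)] ≈ 0.25, and the measured increase comfortably exceeds p ⋅ Var[f(X)] = 0.05, consistent with (though much stronger than) the theorem’s lower bound in this easy monotone case.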
