Backdoor attacks mislead machine-learning models into outputting an
attacker-specified class when presented with a specific trigger at test time. These
attacks require poisoning the training data to compromise the learning
algorithm, e.g., by injecting into the training set poisoning samples that
contain the trigger and are labeled with the desired class. Despite the increasing number
of studies on backdoor attacks and defenses, the underlying factors affecting
the success of backdoor attacks, along with their impact on the learning
algorithm, are not yet well understood. In this work, we aim to shed light on
this issue by showing that backdoor attacks induce a smoother decision
function around the triggered samples -- a phenomenon which we refer to as
\textit{backdoor smoothing}. To quantify backdoor smoothing, we define a
measure that evaluates the uncertainty associated with the predictions of a
classifier around the input samples.
Our experiments show that smoothness increases when the trigger is added to
the input samples, and that this phenomenon is more pronounced for more
successful attacks.
We also provide preliminary evidence that backdoor triggers are not the only
smoothing-inducing patterns: other artificial patterns can also be
detected by our approach, paving the way towards understanding the limitations
of current defenses and designing novel ones.
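
As a rough illustration of what a measure of prediction uncertainty around an input sample might look like, the following Python sketch perturbs a sample with Gaussian noise and computes the entropy of the predicted labels. This is an assumption made for illustration and not necessarily the exact definition used in this work: the helper `local_prediction_entropy`, the Gaussian noise model, the parameters `sigma` and `n_samples`, and the toy classifier are all hypothetical. Under this sketch, a lower entropy corresponds to a smoother (more locally constant) decision function around the sample.

```python
import numpy as np


def local_prediction_entropy(predict_labels, x, n_classes,
                             sigma=0.1, n_samples=200, seed=0):
    """Entropy of the labels predicted on Gaussian perturbations of x.

    Hypothetical helper: `predict_labels` maps a batch of inputs to
    non-negative integer class labels. The noise model and defaults are
    illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Draw n_samples noisy copies of x from an isotropic Gaussian around it.
    noisy = x + sigma * rng.standard_normal((n_samples,) + x.shape)
    labels = predict_labels(noisy)
    # Empirical distribution of predicted classes in the neighborhood.
    freqs = np.bincount(labels, minlength=n_classes) / n_samples
    nonzero = freqs[freqs > 0]
    # Shannon entropy (natural log); zero when every prediction agrees.
    return abs(float(-(nonzero * np.log(nonzero)).sum()))


def toy_classifier(batch):
    """Toy two-class model (assumption): the sign of the first feature."""
    return (batch[:, 0] > 0).astype(int)


# A point on the decision boundary versus a point far away from it.
near_boundary = local_prediction_entropy(toy_classifier, np.array([0.0, 0.0]), 2)
far_from_boundary = local_prediction_entropy(toy_classifier, np.array([5.0, 0.0]), 2)
```

In this toy setting, the point far from the boundary receives the same label under every perturbation, so its entropy is zero, whereas the point on the boundary receives mixed labels and a clearly positive entropy. Comparing such a score on clean and triggered inputs is one way the smoothing effect described above could be probed.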