Abstract
The training phase of machine learning models is a delicate step, especially
in cybersecurity contexts. Recent research has surfaced a series of insidious
training-time attacks that inject backdoors into models designed for security
classification tasks without altering the training labels. With this work, we
propose new techniques that leverage insights into cybersecurity threat models to
effectively mitigate these clean-label poisoning attacks while preserving
model utility. By performing density-based clustering on a carefully chosen
feature subspace, and progressively isolating the suspicious clusters through a
novel iterative scoring procedure, our defensive mechanism can mitigate the
attacks without requiring many of the common assumptions in the existing
backdoor defense literature. To show the generality of our proposed mitigation,
we evaluate it on two clean-label model-agnostic attacks on two different
classic cybersecurity data modalities: network flow classification and malware
classification, using gradient boosting and neural network models.
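To make the described pipeline concrete, below is a minimal illustrative sketch, not the paper's implementation: it assumes DBSCAN as the density-based clustering algorithm and substitutes a simple centroid-distance criterion for the paper's novel iterative scoring procedure. The function name, the feature_idx parameter, and the n_rounds budget are all hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def filter_poisoned_samples(X, y, feature_idx, n_rounds=5,
                            eps=0.5, min_samples=10):
    """Iteratively quarantine the most suspicious density cluster.

    X: (n_samples, n_features) training matrix; y: integer labels;
    feature_idx: indices of the feature subspace used for clustering.
    Returns a boolean mask over the rows of X marking samples to keep.
    """
    keep = np.ones(len(X), dtype=bool)
    for _ in range(n_rounds):
        idx = np.where(keep)[0]
        sub = X[idx][:, feature_idx]  # restrict to the chosen subspace
        cluster_ids = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(sub)

        worst_cluster, worst_score = None, -np.inf
        for c in set(cluster_ids) - {-1}:  # -1 is DBSCAN's noise label
            members = idx[cluster_ids == c]
            # Illustrative score: how far the cluster centroid sits from the
            # centroid of all kept points sharing its majority label. A dense
            # cluster far from its class mass is treated as suspicious.
            label = np.bincount(y[members]).argmax()
            class_pts = X[keep & (y == label)][:, feature_idx]
            score = np.linalg.norm(X[members][:, feature_idx].mean(axis=0)
                                   - class_pts.mean(axis=0))
            if score > worst_score:
                worst_cluster, worst_score = c, score

        if worst_cluster is None:  # no clusters found; stop early
            break
        keep[idx[cluster_ids == worst_cluster]] = False  # quarantine cluster
    return keep


# Usage: retrain the classifier on the filtered set, e.g.
# mask = filter_poisoned_samples(X_train, y_train, feature_idx=[0, 3, 7])
# model.fit(X_train[mask], y_train[mask])
```

Because the filtering operates on the training data rather than on model internals, the same sketch applies unchanged whether the downstream classifier is a gradient boosting model or a neural network, which is the model-agnostic property the abstract claims.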