Adversarial data poisoning is an effective attack against machine learning
and threatens model integrity by introducing poisoned data into the training
dataset. So far, it has been studied mostly for classification, even though
regression learning is used in many mission-critical systems (such as medication
dosing, control of cyber-physical systems, and power supply management).
In this work, we therefore aim to evaluate all aspects of data poisoning
attacks on regression learning, exceeding previous work in both breadth and
depth. We present realistic scenarios in which data poisoning
attacks threaten production systems and introduce a novel black-box attack,
which is then applied to a real-world medical use case. We observe that the
mean squared error (MSE) of the regressor increases to 150 percent when only
two percent of poison samples are inserted. Finally, we present a new
defense strategy against the novel and previous attacks and evaluate it
thoroughly on 26 datasets. Our experiments show that the proposed defense
strategy effectively mitigates the considered attacks.