Deep neural networks usually require large labeled datasets for training to
achieve state-of-the-art performance in many tasks, such as image
classification and natural language processing. Although active Internet users
create large amounts of data every day, most of it is unlabeled and vulnerable
to data poisoning attacks. In this paper, we develop an
efficient active learning method that requires fewer labeled instances and
incorporates adversarial retraining, in which additional labeled artificial
data are generated without increasing the labeling budget. The generated
adversarial examples also provide a way to measure the
vulnerability of the model. To assess the proposed method under adversarial
settings, namely malicious mislabeling and data poisoning attacks, we perform
an extensive evaluation on a reduced CIFAR-10 dataset that contains only two
classes: airplane and frog. Our experimental results
demonstrate that the proposed active learning method is efficient at defending
against malicious mislabeling and data poisoning attacks. Specifically, whereas
the baseline active learning method based on random sampling performs poorly
(about 50% accuracy) under a malicious mislabeling attack, the proposed active
learning method achieves the desired accuracy of 89% using, on average, only
one-third of the dataset.
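
To make the general idea concrete, the following is a minimal sketch of pool-based active learning combined with adversarial retraining. It is not the method evaluated in this paper: the logistic-regression model, the uncertainty-sampling query rule, the FGSM-style perturbation, the synthetic two-class data, and all function names and parameter values are assumptions chosen only for illustration. The sketch shows how adversarial copies of already-labeled points can reuse their labels, so the labeling budget does not grow, and how the error rate on adversarial inputs can serve as a rough vulnerability measure.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, epochs=200, lr=0.1):
    """Fit logistic regression by full-batch gradient descent; returns (w, b)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        err = sigmoid(X @ w + b) - y          # d(cross-entropy)/d(logit)
        w -= lr * (X.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

def fgsm(X, y, w, b, eps):
    """FGSM-style step along the sign of the input gradient of the loss."""
    grad_x = (sigmoid(X @ w + b) - y)[:, None] * w[None, :]
    return X + eps * np.sign(grad_x)

def fit_with_adversarial(X_l, y_l, eps):
    """Train, craft adversarial copies that reuse the known labels, retrain."""
    w, b = train_logreg(X_l, y_l)
    X_adv = fgsm(X_l, y_l, w, b, eps)         # no extra labeling cost
    return train_logreg(np.vstack([X_l, X_adv]), np.concatenate([y_l, y_l]))

def active_learn(X_pool, oracle, n_init=10, n_rounds=20, batch=5, eps=0.1, seed=0):
    """Uncertainty sampling (an assumed query rule) with adversarial retraining."""
    rng = np.random.default_rng(seed)
    n = len(X_pool)
    labeled = [int(i) for i in rng.choice(n, size=n_init, replace=False)]
    labels = [oracle(i) for i in labeled]
    for _ in range(n_rounds):
        w, b = fit_with_adversarial(X_pool[labeled], np.array(labels, float), eps)
        unlabeled = np.setdiff1d(np.arange(n), labeled)
        p = sigmoid(X_pool[unlabeled] @ w + b)
        # Query the points whose predicted probability is closest to 0.5.
        for i in unlabeled[np.argsort(np.abs(p - 0.5))[:batch]]:
            labeled.append(int(i))
            labels.append(oracle(int(i)))
    w, b = fit_with_adversarial(X_pool[labeled], np.array(labels, float), eps)
    return w, b, labeled

def error_rate(X, y, w, b):
    return float(((sigmoid(X @ w + b) > 0.5).astype(float) != y).mean())

# Synthetic stand-in for a two-class problem (not CIFAR-10).
data_rng = np.random.default_rng(0)
X = np.vstack([data_rng.normal(-2, 1, (200, 2)), data_rng.normal(2, 1, (200, 2))])
y = np.concatenate([np.zeros(200), np.ones(200)])

w, b, labeled = active_learn(X, oracle=lambda i: y[i])
acc = 1.0 - error_rate(X, y, w, b)
# Vulnerability proxy: error rate on adversarially perturbed inputs.
vulnerability = error_rate(fgsm(X, y, w, b, eps=0.1), y, w, b)
print(f"labels used: {len(labeled)} of {len(X)}")
print(f"clean accuracy: {acc:.3f}, adversarial error: {vulnerability:.3f}")
```

In this sketch the oracle is the only source of labeling cost; a mislabeling attack could be simulated by passing an oracle that returns flipped labels for some indices.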