Machine learning models are vulnerable to adversarial attacks that rely on
perturbing the input data. This work proposes a novel strategy using
autoencoder deep neural networks to defend a machine learning model against two
gradient-based attacks: the Fast Gradient Sign attack and the Fast Gradient attack.
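As a concrete illustration (not taken from the paper's experiments), the two attacks can be sketched in NumPy against a toy logistic-regression model; the weights, input, label, and perturbation bound below are purely hypothetical:

```python
import numpy as np

def fgsm(x, grad, eps):
    # Fast Gradient Sign: move each feature by eps in the sign of the gradient
    return x + eps * np.sign(grad)

def fast_gradient(x, grad, eps):
    # Fast Gradient: take an L2 step of length eps along the gradient direction
    return x + eps * grad / (np.linalg.norm(grad) + 1e-12)

# Toy logistic-regression model (weights and bias are illustrative)
w = np.array([1.0, -2.0, 0.5])
b = 0.1

def loss_and_grad(x, y):
    # Binary cross-entropy loss and its gradient with respect to the input x
    z = w @ x + b
    p = 1.0 / (1.0 + np.exp(-z))
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    grad = (p - y) * w  # dL/dx for a linear model
    return loss, grad

x = np.array([0.2, 0.4, -0.1])
y = 1.0
loss0, g = loss_and_grad(x, y)
x_fgsm = fgsm(x, g, eps=0.1)       # L-infinity perturbation bounded by eps
x_fg = fast_gradient(x, g, eps=0.1)  # L2 perturbation bounded by eps
loss1, _ = loss_and_grad(x_fgsm, y)
# Both attacks increase the classifier's loss on the perturbed input
```

Both attacks follow the loss gradient; they differ only in the norm that bounds the perturbation (L-infinity for the sign variant, L2 for the plain gradient step).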
First, we denoise the test data with an autoencoder trained on both clean and
corrupted data. Then, we reduce the dimension of the denoised data using the
hidden-layer representation of a second autoencoder. We perform this
experiment for multiple values of the bound on the adversarial perturbation,
and consider different numbers of reduced dimensions. When the test data is
preprocessed using this cascaded pipeline, the tested deep neural network
classifier achieves substantially higher accuracy, mitigating the effect of the
adversarial perturbation.
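A minimal sketch of the cascaded preprocessing idea, using linear stand-ins for the two autoencoders (a least-squares map for the denoising step, and a PCA projection in place of the second autoencoder's hidden-layer representation); the data, noise level, and dimensions are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples in 10 correlated dimensions
X_clean = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 10))
X_noisy = X_clean + 0.5 * rng.normal(size=X_clean.shape)

# Step 1 -- denoising: a linear least-squares map from noisy to clean data
# (a linear stand-in for the denoising autoencoder)
W_dn, *_ = np.linalg.lstsq(X_noisy, X_clean, rcond=None)
X_denoised = X_noisy @ W_dn

# Step 2 -- dimensionality reduction: project onto the top-k principal
# directions (the optimum a linear autoencoder's hidden layer converges to)
k = 3
Xc = X_denoised - X_denoised.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
X_reduced = Xc @ Vt[:k].T  # reduced representation, shape (200, k)

mse_before = np.mean((X_noisy - X_clean) ** 2)
mse_after = np.mean((X_denoised - X_clean) ** 2)
# The denoising step reduces reconstruction error: mse_after < mse_before
```

The reduced representation `X_reduced` would then be fed to the downstream classifier; in the paper's pipeline both stages are nonlinear autoencoders rather than the linear maps used here.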