Cross-Representation Transferability of Adversarial Attacks: From Spectrograms to Audio Waveforms

Authors: Karl Michel Koerich, Mohammad Esmaeilpour, Sajjad Abdoli, Alceu de Souza Britto Jr., Alessandro Lameiras Koerich | Published: 2019-10-22 | Updated: 2020-07-29

2019.10.222025.05.28

Authors: Karl Michel Koerich, Mohammad Esmaeilpour, Sajjad Abdoli, Alceu de Souza Britto Jr., Alessandro Lameiras Koerich
Published: 2019-10-22 | Updated: 2020-07-29

Source: https://arxiv.org/abs/1910.10106

PDF: https://arxiv.org/pdf/1910.10106

Labels Predicted by AI

Performance Evaluation Adversarial Transferability Adversarial Learning

Please note that these labels were automatically added by AI. Therefore, they may not be entirely accurate.
For more details, please see the About the Literature Database page.

Abstract

This paper shows the susceptibility of spectrogram-based audio classifiers to adversarial attacks and the transferability of such attacks to audio waveforms. Some commonly used adversarial attacks to images have been applied to Mel-frequency and short-time Fourier transform spectrograms, and such perturbed spectrograms are able to fool a 2D convolutional neural network (CNN). Such attacks produce perturbed spectrograms that are visually imperceptible by humans. Furthermore, the audio waveforms reconstructed from the perturbed spectrograms are also able to fool a 1D CNN trained on the original audio. Experimental results on a dataset of western music have shown that the 2D CNN achieves up to 81.87 performance drops to 12.09 achieves up to 78.29 performance drops to 27.91 the perturbed spectrograms.