arXiv
Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data
Abstract
Thanks to the growing availability of spoofing databases and rapid advances
in using them, systems for detecting voice spoofing attacks are becoming more
and more capable, and error rates close to zero are being reached for the
ASVspoof2015 database. However, speech synthesis and voice conversion paradigms
that are not considered in the ASVspoof2015 database are appearing. Such
examples include direct waveform modelling and generative adversarial networks.
We also need to investigate the feasibility of training spoofing systems using
only low-quality found data. For that purpose, we developed a generative
adversarial network-based speech enhancement system that improves the quality
of speech data found in publicly available sources. Using the enhanced data, we
trained state-of-the-art text-to-speech and voice conversion models and
evaluated them in terms of perceptual speech quality and speaker similarity.
The results show that the enhancement models significantly improved the SNR of
low-quality degraded data found in publicly available sources and that they
significantly improved the perceptual cleanliness of the source speech without
significantly degrading the naturalness of the voice. However, the results also
show limitations when generating speech with the low-quality found data.
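
As a rough illustration of the SNR improvement mentioned above, the following minimal sketch computes SNR in decibels against a clean reference signal. It assumes a clean reference is available, which is generally not the case for found data; the abstract does not state how the authors estimated SNR, so the function name `snr_db` and the toy signals are hypothetical and used only for illustration.

```python
import math


def snr_db(clean, noisy):
    """Return the SNR in dB of `noisy` measured against the reference `clean`.

    Assumption: a time-aligned clean reference exists. This is an
    illustrative definition, not the estimation method used in the paper.
    """
    if len(clean) != len(noisy):
        raise ValueError("signals must have the same length")
    signal_power = sum(s * s for s in clean)
    noise_power = sum((n - s) ** 2 for s, n in zip(clean, noisy))
    if noise_power == 0:
        return math.inf
    if signal_power == 0:
        return -math.inf
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical toy data: a sine wave with a large and a small disturbance.
clean = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
noisy = [s + 0.1 * (-1) ** i for i, s in enumerate(clean)]
enhanced = [s + 0.01 * (-1) ** i for i, s in enumerate(clean)]

improvement = snr_db(clean, enhanced) - snr_db(clean, noisy)
print(f"SNR improvement: {improvement:.2f} dB")
```

In this toy case the disturbance amplitude drops by a factor of ten, so the noise power drops by a factor of one hundred. A higher SNR alone does not guarantee that naturalness is preserved, which is why the abstract reports perceptual evaluations separately.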