VoiceWukong: Benchmarking Deepfake Voice Detection

wav2vec 2.0: A framework for self-supervised learning of speech representations

Alexei Baevski, Henry Zhou, Abdelrahman Mohamed, Michael Auli

Published: 2020

Better speech synthesis through scaling

James Betker

Published: 2023

2017 20th conference of the oriental chapter of the international coordinating committee on speech databases and speech I/O systems and assessment (O-COCOSDA)

Aishell-1: An open-source mandarin speech corpus and a speech recognition baseline

Hui Bu, Jiayu Du, Xingyu Na, Bengu Wu, Hao Zheng

Published: 2017

2010 7th International Symposium on Chinese Spoken Language Processing

Speaker verification against synthetic speech

Lian-Wu Chen, Wu Guo, Li-Rong Dai

Published: 2010

Rawbmamba: End-to-end bidirectional state space model for audio deepfake detection

Yujie Chen, Jiangyan Yi, Jun Xue, Chenglong Wang, Xiaohui Zhang, Shunbo Dong, Siding Zeng, Jianhua Tao, Lv Zhao, Cunhang Fan

Published: 2024

2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

Replay detection using cqt-based modified group delay feature and resnewt network in asvspoof 2019

Xingliang Cheng, Mingxing Xu, Thomas Fang Zheng

Published: 2019

Diff-hiervc: Diffusion-based hierarchical voice conversion with robust pitch generation and masked prior for zero-shot speaker adaptation

Ha-Yeong Choi, Sang-Hoon Lee, Seong-Whan Lee

Published: 2023

Proceedings of the AAAI Conference on Artificial Intelligence

Dddm-vc: Decoupled denoising diffusion models with disentangled representation and prior mixup for verified robust voice conversion

Ha-Yeong Choi, Sang-Hoon Lee, Seong-Whan Lee

Published: 2024

Qwen2-audio technical report

Yunfei Chu, Jin Xu, Qian Yang, Haojie Wei, Xipin Wei, Zhifang Guo, Yichong Leng, Yuanjun Lv, Jinzheng He, Junyang Lin

Published: 2024

ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing

Deepfake speech detection through emotion recognition: a semantic approach

Emanuele Conti, Davide Salvi, Clara Borrelli, Brian Hosler, Paolo Bestagini, Fabio Antonacci, Augusto Sarti, Matthew C Stamm, Stefano Tubaro

Published: 2022

Proc. Interspeech

Synthetic speech discrimination using pitch pattern statistics derived from image analysis

Phillip L De Leon, Bryan Stewart, Junichi Yamagishi

Published: 2012

IEEE Transactions on Dependable and Secure Computing

Towards benchmarking and evaluating deepfake detection

Jingyi Deng, Chenhao Lin, Pengbin Hu, Chao Shen, Qian Wang, Qi Li, Qiming Li

Published: 2024

ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing

Samo: Speaker attractor multi-center one-class learning for voice anti-spoofing

Siwen Ding, You Zhang, Zhiyao Duan

Published: 2023

ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing

Bts-e: Audio deepfake detection using breathing-talking-silence encoder

Thien-Phuc Doan, Long Nguyen-Vu, Souhwan Jung, Kihun Hong

Published: 2023

Applied Sciences

A review of time-scale modification of music signals

Jonathan Driedger, Meinard Müller

Published: 2016

Wavefake: A data set to facilitate audio deepfake detection

Joel Frank, Lea Schönherr

Aishell-4: An open source dataset for speech enhancement, separation, recognition and speaker diarization in conference scenario

Yihui Fu, Luyao Cheng, Shubo Lv, Yukai Jv, Yuxiang Kong, Zhuo Chen, Yanxin Hu, Lei Xie, Jian Wu, Hui Bu

IEEE transactions on pattern analysis and machine intelligence

Res2net: A new multi-scale backbone architecture

Shang-Hua Gao, Ming-Ming Cheng, Kai Zhao, Xin-Yu Zhang, Ming-Hsuan Yang, Philip Torr

Published: 2019

Partially-connected differentiable architecture search for deepfake and spoofing detection

Wanying Ge, Michele Panariello, Jose Patino, Massimiliano Todisco, Nicholas Evans

Raw differentiable architecture search for speech deepfake and spoofing detection

Wanying Ge, Jose Patino, Massimiliano Todisco, Nicholas Evans

Quickvc: Any-to-many voice conversion using inverse short-time fourier transform for faster conversion

Houjian Guo, Chaoran Liu, Carlos Toshinori Ishi, Hiroshi Ishiguro

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Published: 2016

arXiv

Large language models for software engineering: A systematic literature review

X. Hou, Y. Zhao, Y. Liu, Z. Yang, K. Wang, L. Li, X. Luo, D. Lo, J. Grundy, H. Wang

IEEE Signal Processing Letters

Towards end-to-end synthetic speech detection

Guang Hua, Andrew Beng Jin Teoh, Haijian Zhang

Published: 2021

ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Aasist: Audio anti-spoofing using integrated spectro-temporal graph attention networks

Jee-weon Jung, Hee-Soo Heo, Hemlata Tak, Hye-jin Shim, Joon Son Chung, Bong-Jin Lee, Ha-Jin Yu, Nicholas Evans

Published: 2022

Proc. Interspeech

Improved rawnet with feature map scaling for text-independent speaker verification using raw waveforms

Jee-weon Jung, Seung-bin Kim, Hye-jin Shim, Ju-ho Kim, Ha-Jin Yu

Published: 2020

Sasv 2022: The first spoofing-aware speaker verification challenge

Jee-weon Jung, Hemlata Tak, Hye-jin Shim, Hee-Soo Heo, Bong-Jin Lee, Soo-Whan Chung, Ha-Jin Yu, Nicholas Evans, Tomi Kinnunen

How deep are the fakes? focusing on audio deepfake: A survey

Zahra Khanjani, Gabrielle Watson, Vandana P Janeja

Frontiers in Big Data

Audio deepfakes: A survey

Zahra Khanjani, Gabrielle Watson, Vandana P Janeja

Published: 2023

Conditional variational autoencoder with adversarial learning for end-to-end text-to-speech

Jaehyeon Kim, Jungil Kong, Juhee Son

Published: 2021

ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing

Phase-aware spoof speech detection based on res2net with phase network

Juntae Kim, Sung Min Ban

Published: 2023

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

A continual deepfake detection benchmark: Dataset, methods, and essentials

Chuqiao Li, Zhiwu Huang, Danda Pani Paudel, Yabin Wang, Mohamad Shahbazi, Xiaopeng Hong, Luc Van Gool

Published: 2023

Styletts 2: Towards human-level text-to-speech through style diffusion and adversarial training with large speech language models

Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani

Published: 2023

Starganv2-vc: A diverse, unsupervised, non-parallel framework for natural-sounding voice conversion

Yinghao Aaron Li, Ali Zare, Nima Mesgarani

Published: 2021

Darts: Differentiable architecture search

Hanxiao Liu, Karen Simonyan, Yiming Yang

Published: 2018

Diffgan-tts: High-fidelity and efficient text-to-speech with denoising diffusion gans

Songxiang Liu, Dan Su, Dong Yu

Published: 2022

Applied Computer Science

Novel technique of customizing the audio fade-out shape

Lucian Lupşa-Tătaru

Published: 2018

Applied Computer Science

Implementing the fade-in audio effect for real-time computing

Lucian Lupşa-Tătaru

Published: 2019

Magicdata mandarin chinese read speech corpus

Magic Data Technology Co., Ltd.

Published: 2019

ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing

The vicomtech audio deepfake detection system based on wav2vec2 for the 2022 add challenge

Juan M Martín-Doñas, Aitor Álvarez

Published: 2022

INTERSPEECH

Optimization of false acceptance/rejection rates and decision threshold for end-to-end text-dependent speaker verification systems

Victoria Mingote, Antonio Miguel, Dayana Ribas, Alfonso Ortega Giménez, Eduardo Lleida

Published: 2019

Does audio deepfake detection generalize?

Nicolas M Müller, Pavel Czempin, Franziska Dieckmann, Adam Froghyar, Konstantin Böttinger

Interspeech

Speaker recognition-assisted robust audio deepfake detection

Jiahui Pan, Shuai Nie, Hui Zhang, Shulin He, Kanghao Zhang, Shan Liang, Xueliang Zhang, Jianhua Tao

Published: 2022

IEEE Access

Deepfake generation and detection: Case study and challenges

Yogesh Patel, Sudeep Tanwar, Rajesh Gupta, Pronaya Bhattacharya, Innocent Ewean Davidson, Royi Nyameko, Srinivas Aluvala, Vrince Vimal

Published: 2023

Deepfake generation and detection: A benchmark and survey

Gan Pei, Jiangning Zhang, Menghan Hu, Zhenyu Zhang, Chengjie Wang, Yunsheng Wu, Guangtao Zhai, Jian Yang, Chunhua Shen, Dacheng Tao

Published: 2024

Proceedings of the 23rd ACM international conference on Multimedia

Esc: Dataset for environmental sound classification

Karol J Piczak

Published: 2015

Openvoice: Versatile instant voice cloning

Zengyi Qin, Wenliang Zhao, Xumin Yu, Xin Sun

2019 International Conference on Speech Technology and Human-Computer Dialogue (SpeD)

For: A dataset for synthetic speech detection

Ricardo Reimao, Vassilios Tzerpos

Published: 2019

Aishell-3: A multi-speaker mandarin tts corpus and the baselines

Yao Shi, Hui Bu, Xin Xu, Shaoji Zhang, Ming Li

Published: 2020

Jsut corpus: free large-scale japanese speech corpus for end-to-end speech synthesis

Ryosuke Sonobe, Shinnosuke Takamichi, Hiroshi Saruwatari

Published: 2017

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Ai-synthesized voice detection using neural vocoder artifacts

Chengzhe Sun, Shan Jia, Shuwei Hou, Siwei Lyu

Published: 2023

End-to-end spectro-temporal graph attention networks for speaker verification anti-spoofing and speech deepfake detection

Hemlata Tak, Jee-weon Jung, Jose Patino, Madhu Kamble, Massimiliano Todisco, Nicholas Evans

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing

Rawboost: A raw data boosting and augmentation method applied to automatic speaker verification anti-spoofing

Hemlata Tak, Madhu Kamble, Jose Patino, Massimiliano Todisco, Nicholas Evans

Published: 2022

arxiv

End-to-end anti-spoofing with RawNet2

Hemlata Tak, Jose Patino, Massimiliano Todisco, Andreas Nautsch, Nicholas Evans, Anthony Larcher

Published: 2020.11.3

Spoofing countermeasures aim to protect automatic speaker verification systems from attempts to manipulate their reliability with the use of spoofed speech signals. While results from the most recent ASVspoof 2019 evaluation show great potential to detect most forms of attack, some continue to evade detection. This paper reports the first application of RawNet2 to anti-spoofing. RawNet2 ingests raw audio and has potential to learn cues that are not detectable using more traditional countermeasure solutions. We describe modifications made to the original RawNet2 architecture so that it can be applied to anti-spoofing. For A17 attacks, our RawNet2 systems results are the second-best reported, while the fusion of RawNet2 and baseline countermeasures gives the second-best results reported for the full ASVspoof 2019 logical access condition. Our results are reproducible with open source software.

Keywords: deepfake detection, speech recognition, process, model, evaluation

Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation

Hemlata Tak, Massimiliano Todisco, Xin Wang, Jee-weon Jung, Junichi Yamagishi, Nicholas Evans

Graph attention networks for anti-spoofing

Hemlata Tak, Jee-weon Jung, Jose Patino, Massimiliano Todisco, Nicholas Evans

Published: 2021

2016 IEEE International conference on acoustics, speech and signal processing (ICASSP)

Spoofing detection from a feature representation perspective

Xiaohai Tian, Zhizheng Wu, Xiong Xiao, Eng Siong Chng, Haizhou Li

Published: 2016

Asvspoof 2019: Future horizons in spoofed and fake audio detection

Massimiliano Todisco, Xin Wang, Ville Vestman, Md Sahidullah, Héctor Delgado, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Tomi Kinnunen, Kong Aik Lee

Published: 2019

Proc. ASVspoof 2021 Workshop

Stc antispoofing systems for the asvspoof2021 challenge

Anton Tomilov, Aleksei Svishchev, Marina Volkova, Artem Chirkovskiy, Alexander Kondratev, Galina Lavrentyeva

Published: 2021

Superseded-cstr vctk corpus: English multi-speaker corpus for cstr voice cloning toolkit

Christophe Veaux, Junichi Yamagishi, Kirsten MacDonald

Published: 2016

FALA workshop

Speaker verification performance degradation against spoofing and tampering attacks

Jesús Villalba, Eduardo Lleida

Published: 2010

IEEE INFOCOM 2019-IEEE Conference on Computer Communications

Voicepop: A pop noise based anti-spoofing system for voice authentication on smartphones

Qian Wang, Xiu Lin, Man Zhou, Yanjiao Chen, Cong Wang, Qi Li, Xiangyang Luo

Published: 2019

Proceedings of the 28th ACM International Conference on Multimedia

Deepsonar: Towards effective and robust detection of ai-synthesized fake voices

Run Wang, Felix Juefei-Xu, Yihao Huang, Qing Guo, Xiaofei Xie, Lei Ma, Yang Liu

Published: 2020

A comparative study on recent neural spoofing countermeasures for synthetic speech detection

Xin Wang, Junichi Yamagishi

Audio anti-spoofing using a simple attention module and joint optimization based on additive angular margin loss and meta-learning

Zhenyu Wang, John HL Hansen

Applied Sciences

A history of audio effects

Thomas Wilmering, David Moffat, Alessia Milo, Mark B Sandler

Published: 2020

Clad: Robust audio deepfake detection against manipulation attacks with contrastive learning

Haolin Wu, Jing Chen, Ruiying Du, Cong Wu, Kun He, Xingcan Shang, Hao Ren, Guowen Xu

Published: 2024

ASVspoof 2021 Workshop-Automatic Speaker Verification and Spoofing Countermeasures Challenge

Asvspoof 2021: accelerating progress in spoofed and deepfake speech detection

Junichi Yamagishi, Xin Wang, Massimiliano Todisco, Md Sahidullah, Jose Patino, Andreas Nautsch, Xuechen Liu, Kong Aik Lee, Tomi Kinnunen, Nicholas Evans

Published: 2021

ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing

Add 2022: the first audio deep synthesis detection challenge

Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan

Published: 2022

Add 2023: the second audio deepfake detection challenge

Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren