Robust Spoofed Speech Detection via Temporal Pyramid Modeling


arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2606.16837

PDF

https://arxiv.org/pdf/2606.16837

Paper Information

Author: Mahtab Masoudi Nezhad, Nima Karimian
Published: 6-16-2026
Affiliation: Lane Department of Computer Science and Electrical Engineering, West Virginia University
Country: United States of America
Conference

Labels Estimated by AI

Speech Recognition Technology, Dataset Generation, Data Collection

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Spoofed speech detection is increasingly challenged by realistic synthesis, voice conversion, and replay attacks, with cross-dataset generalization remaining a major limitation. In this work, we propose a Temporal Pyramid Adapter that uses parallel temporal convolutions with varying receptive fields to capture multi-scale spoofing cues, ranging from local artifacts to global prosodic irregularities. We also integrate self-supervised XLS-R representations with front-end adapters, including Mel, Sinc, and a Temporal Pyramid design for multi-scale temporal modeling. The proposed model is evaluated across multiple benchmarks, including ASVspoof 2017, ASVspoof 2021 (DF/LA), PartialSpoof, DiffSSD, and the multilingual HQ-MPSD dataset. Experimental results show that the Temporal Pyramid model achieves an AUC of 99.24% and an EER of 3.87% on the PartialSpoof database, significantly outperforming the base model and several state-of-the-art baselines such as LCNN-BLSTM (9.87% EER) and TRACE (8.08% EER). Additionally, multilingual evaluations confirm that spoofing artifacts are independent of language. While self-supervised representations improve robustness, performance degrades under domain and language shifts, highlighting the need for better adaptation and calibration strategies.
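
The idea of parallel temporal convolutions with different receptive fields can be illustrated with a toy sketch in plain Python. This is not the authors' implementation: the kernel sizes (3, 7, 15), the averaging kernels standing in for learned filters, the zero padding, the concatenation-based fusion, and the names conv1d_same and temporal_pyramid are all assumptions made for illustration.

```python
# Toy sketch of a multi-scale ("temporal pyramid") block in pure Python.
# Assumptions (not from the paper): kernel sizes, averaging kernels standing in
# for learned filters, zero "same" padding, and fusion by concatenation.

def conv1d_same(signal, kernel):
    """1-D cross-correlation with zero padding so output length equals input length."""
    k = len(kernel)
    pad = k // 2  # odd kernel sizes keep the output aligned with the input
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(padded[t + j] * kernel[j] for j in range(k))
        for t in range(len(signal))
    ]

def temporal_pyramid(signal, kernel_sizes=(3, 7, 15)):
    """Run parallel branches with different receptive fields and concatenate per frame."""
    branches = []
    for k in kernel_sizes:
        kernel = [1.0 / k] * k  # placeholder for a learned filter of width k
        branches.append(conv1d_same(signal, kernel))
    # One feature vector per time step: [short-range, mid-range, long-range]
    return [list(frame) for frame in zip(*branches)]

# A single-frame spike: the narrow branch responds only near the spike,
# while the wide branch still responds several frames away.
frames = [0.0] * 21
frames[10] = 1.0
features = temporal_pyramid(frames)
```

In this sketch, a frame two steps away from the spike gets no response from the width-3 branch but a nonzero response from the width-7 and width-15 branches, which is the sense in which wider branches see longer-range context. A real model would presumably learn the filters and operate on multi-channel features such as XLS-R embeddings.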

External Datasets

ASVspoof 2017

ASVspoof 2021 (DF/LA)

PartialSpoof

DiffSSD

HQ-MPSD