Long Term Spectral Statistics for Voice Presentation Attack Detection

Idiap, 2017

Automatic speaker verification systems can be spoofed through recorded, synthetic or voice-converted speech of target speakers. To make these systems practically viable, the detection of such attacks, referred to as presentation attacks, is of paramount interest. In that direction, this paper investigates two aspects: (a) a novel approach to detect presentation attacks where, unlike conventional approaches, no assumptions are made about the speech signal; instead, the attacks are detected by computing first-order and second-order spectral statistics and feeding them to a classifier, and (b) generalization of presentation attack detection systems across databases. Our investigations on the Interspeech 2015 ASVspoof challenge dataset and the AVspoof dataset show that, when compared to approaches based on conventional short-term spectral processing, the proposed approach with a linear discriminative classifier yields a better system, irrespective of whether the spoofed signal is replayed to the microphone or is directly injected into the system software process. Cross-database investigations show that neither the short-term spectral processing based approaches nor the proposed approach yields systems that are able to generalize across databases or methods of attack. This reveals the difficulty of the problem and the need for further resources and research.
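
The feature extraction described in the abstract can be illustrated with a minimal sketch. It assumes that the first-order and second-order statistics are the per-frequency-bin mean and standard deviation of the log-magnitude spectrum, taken over all frames of an utterance. The abstract does not specify the frame length, hop, FFT size or window, so the values below, as well as the function name `long_term_spectral_stats`, are illustrative assumptions rather than the authors' configuration.

```python
import numpy as np

def long_term_spectral_stats(signal, frame_len=256, hop=128, n_fft=256):
    """Return per-bin mean and std of the log-magnitude spectrum over time.

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    if n_frames < 1:
        raise ValueError("signal is shorter than one frame")

    # Slice the signal into overlapping frames: shape (n_frames, frame_len).
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)

    # Magnitude spectrum of each frame; floor it to avoid log(0).
    mag = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))
    log_mag = np.log(np.maximum(mag, 1e-10))

    # First-order (mean) and second-order (std) statistics across frames.
    mean = log_mag.mean(axis=0)
    std = log_mag.std(axis=0)

    # One fixed-length vector per utterance, regardless of its duration.
    return np.concatenate([mean, std])
```

Each utterance is thus reduced to a single vector of length `2 * (n_fft // 2 + 1)`, which could then be passed to a linear classifier trained to separate genuine from spoofed speech.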

Reference EPFL-REPORT-226623

Authors: Muckenhirn, Hannah; Korshunov, Pavel; Magimai.-Doss, Mathew; Marcel, Sébastien

Source: https://infoscience.epfl.ch/record/226623?ln=en
