Deep Neural Network Based Discriminative Training for I-Vector/PLDA Speaker Verification

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

In i-vector based speaker verification, discriminative training of the probabilistic linear discriminant analysis (PLDA) model has proven to be an effective way to improve performance. This paper uses a deep neural network (DNN), with its strong capacity for nonlinear modeling, to strengthen the original discriminatively trained classifiers. We first propose a DNN-based dimensionality reduction model to replace the linear discriminant analysis (LDA) step, and then propose a discriminative training algorithm that jointly optimizes the network and the PLDA scoring function under a single discriminative criterion. Our experiments show performance improvements on the male trials of the short2-short3 core data set of NIST SRE08.
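
The pipeline outlined in the abstract (a nonlinear projection in place of LDA, followed by PLDA scoring and a single discriminative loss) can be illustrated with a minimal forward-pass sketch. Everything here is an assumption made for illustration and is not taken from the paper: the one-hidden-layer tanh network, the toy dimensions, the quadratic form of the PLDA-style score, the logistic loss standing in for the "single discriminative criterion", and all function and parameter names. It shows only how one trial would be scored and penalized; joint training would additionally backpropagate this loss into both the network weights and the scoring parameters.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def project(x, W1, b1, W2, b2):
    """Hypothetical stand-in for LDA: one tanh hidden layer, linear output."""
    h = [math.tanh(z + b) for z, b in zip(matvec(W1, x), b1)]
    return [z + b for z, b in zip(matvec(W2, h), b2)]


def plda_score(y1, y2, Lam, Gam, c, k):
    """Quadratic PLDA-style trial score (assumed form).

    The score is symmetric in (y1, y2) when Lam is symmetric.
    """
    cross = 2.0 * dot(y1, matvec(Lam, y2))
    within = dot(y1, matvec(Gam, y1)) + dot(y2, matvec(Gam, y2))
    linear = dot(c, [a + b for a, b in zip(y1, y2)])
    return cross + within + linear + k


def logistic_loss(score, is_target):
    """Numerically stable log(1 + exp(-t * score)), with t = +1 for target trials."""
    t = 1.0 if is_target else -1.0
    m = -t * score
    return max(m, 0.0) + math.log1p(math.exp(-abs(m)))


def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
D, H, d = 6, 4, 2  # toy input, hidden, and output sizes chosen for illustration

# Randomly initialised network and scoring parameters (all hypothetical).
W1, b1 = rand_matrix(H, D), [0.0] * H
W2, b2 = rand_matrix(d, H), [0.0] * d
A = rand_matrix(d, d)
Lam = [[0.5 * (A[i][j] + A[j][i]) for j in range(d)] for i in range(d)]  # symmetrised
Gam = [[-0.1 if i == j else 0.0 for j in range(d)] for i in range(d)]
c, k = [0.0] * d, 0.0

# Two synthetic "i-vectors" forming one verification trial.
enroll = [random.gauss(0.0, 1.0) for _ in range(D)]
test = [random.gauss(0.0, 1.0) for _ in range(D)]

y1 = project(enroll, W1, b1, W2, b2)
y2 = project(test, W1, b1, W2, b2)
score = plda_score(y1, y2, Lam, Gam, c, k)
loss = logistic_loss(score, is_target=True)
print("score:", score, "loss:", loss)
```

Because the loss depends on the network weights only through the projected vectors, a single gradient step on it would update the projection and the scoring parameters together, which is the sense in which such a criterion is "joint".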

Original language: English
Title of host publication: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5354-5358
Number of pages: 5
ISBN (Print): 9781538646588
State: Published - 10 Sep 2018
Externally published: Yes
Event: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Calgary, Canada
Duration: 15 Apr 2018 to 20 Apr 2018

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2018-April
ISSN (Print): 1520-6149

Conference

Conference: 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018
Country/Territory: Canada
City: Calgary
Period: 15/04/18 to 20/04/18

Keywords

  • DNN
  • Discriminative training
  • PLDA
  • Speaker verification
