Publication Details

End-to-End DNN Based Speaker Recognition Inspired by i-Vector and PLDA

ROHDIN Johan A., SILNOVA Anna, DIEZ Sánchez Mireia, PLCHOT Oldřich, MATĚJKA Pavel and BURGET Lukáš. End-to-End DNN Based Speaker Recognition Inspired by i-Vector and PLDA. In: Proceedings of ICASSP. Calgary: IEEE Signal Processing Society, 2018, pp. 4874-4878. ISBN 978-1-5386-4658-8.
Czech title
End-to-end DNN rozpoznávání mluvčího inspirované i-vektory a PLDA
Type
conference paper
Language
English
Authors
URL
Keywords

Speaker verification, DNN, end-to-end

Abstract

Recently, several end-to-end speaker verification systems based on deep neural networks (DNNs) have been proposed. These systems have been proven to be competitive for text-dependent tasks as well as for text-independent tasks with short utterances. However, for text-independent tasks with longer utterances, end-to-end systems are still outperformed by standard i-vector + PLDA systems. In this work, we develop an end-to-end speaker verification system that is initialized to mimic an i-vector + PLDA baseline. The system is then further trained in an end-to-end manner but regularized so that it does not deviate too far from the initial system. In this way, we mitigate overfitting, which normally limits the performance of end-to-end systems. The proposed system outperforms the i-vector + PLDA baseline on both long and short duration utterances.
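
The abstract does not state which loss or regularizer is used, so the following is only an illustrative sketch of the general idea of training while staying close to an initial system. It assumes a binary cross-entropy loss over same/different-speaker trials and a squared L2 penalty on the distance between the current and the initial parameters. The function names, the weight `lam`, and the toy values are all hypothetical and are not taken from the paper.

```python
import numpy as np

def bce_from_scores(scores, labels):
    """Mean binary cross-entropy for verification trials.

    scores: log-odds that each trial is a same-speaker trial.
    labels: 1 for same-speaker trials, 0 for different-speaker trials.
    """
    signs = 2.0 * np.asarray(labels) - 1.0  # map {0, 1} to {-1, +1}
    # log(1 + exp(-sign * score)), computed in a numerically stable way
    return float(np.mean(np.logaddexp(0.0, -signs * np.asarray(scores))))

def penalty_to_init(params, init_params):
    """Squared L2 distance between current and initial parameters."""
    return float(sum(np.sum((p - p0) ** 2) for p, p0 in zip(params, init_params)))

def regularized_objective(scores, labels, params, init_params, lam=0.01):
    """Trial loss plus a penalty for drifting away from the initial system.

    lam is an assumed trade-off weight; larger values keep the trained
    system closer to its initialization.
    """
    return bce_from_scores(scores, labels) + lam * penalty_to_init(params, init_params)

# Toy example: two parameter arrays, one of which has moved from its start.
init_params = [np.zeros((2, 2)), np.zeros(2)]
params = [np.ones((2, 2)), np.zeros(2)]
scores = np.array([2.0, -1.5, 0.3])
labels = np.array([1, 0, 1])
loss = regularized_objective(scores, labels, params, init_params, lam=0.01)
```

With `lam = 0`, the objective reduces to the plain trial loss, and as `lam` grows the parameters are held closer to the initial system.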

Published
2018
Pages
4874-4878
Proceedings
Proceedings of ICASSP
Conference
IEEE International Conference on Acoustics, Speech and Signal Processing, Calgary, CA
ISBN
978-1-5386-4658-8
Publisher
IEEE Signal Processing Society
Place
Calgary, CA
DOI
10.1109/ICASSP.2018.8461958
UT WoS
000446384605009
EID Scopus
BibTeX
@INPROCEEDINGS{FITPUB11724,
   author = "Johan A. Rohdin and Anna Silnova and Mireia S\'{a}nchez Diez and Old\v{r}ich Plchot and Pavel Mat\v{e}jka and Luk\'{a}\v{s} Burget",
   title = "End-to-End DNN Based Speaker Recognition Inspired by i-Vector and PLDA",
   pages = "4874--4878",
   booktitle = "Proceedings of ICASSP",
   year = 2018,
   location = "Calgary, CA",
   publisher = "IEEE Signal Processing Society",
   ISBN = "978-1-5386-4658-8",
   doi = "10.1109/ICASSP.2018.8461958",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/11724"
}