
Improving Eigenvoices-Based Techniques and SMLLR for Speaker Adaptation by Combining EV and SMLLR Techniques or Using Genetic Algorithms

In Technical Report, 2005.
This paper studies several classical and original methods for speaker adaptation of the acoustic hidden Markov models of an automatic speech recognition system (ASRS). Most real-world applications today require the speaker adaptation process to keep improving the performance of the underlying ASRS as a new speaker pronounces more utterances. The first part of this article is dedicated to this problem. We begin by introducing the Structural EigenVoices (SEV) approach. Compared to EigenVoices (EV), SEV continues to improve the performance of an ASRS as more sentences become available, well beyond the point where the EV system has reached its limit. We then describe four methods that combine the advantages of Structural Maximum Likelihood Linear Regression (SMLLR) and EigenVoices-based techniques (EV or SEV). We show experimentally that one of them, SEVSMLLR, can improve the performance of an ASRS at least as much as SMLLR, EV, and SEV, irrespective of the number of adaptation utterances used. The second part of our work focuses on the use of genetic algorithms for rapidly adapting acoustic models. Whereas all of the standard adaptation methods (e.g., SMLLR, SMAP, EV) are based on the EM procedure and thus provide a single locally optimal solution, genetic algorithms are theoretically able to provide several globally optimal solutions. We show experimentally that (1) genetic algorithms and EV improve the performance of an ASRS equivalently, and (2) combining genetic algorithms and EV improves the performance of an ASRS further.
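To make the two ideas in the abstract concrete, the following is a minimal sketch and not the method of the report. It assumes that a speaker is represented by a supervector of stacked Gaussian means, and that the adapted supervector is the mean supervector plus a weighted sum of eigenvoices. The EM-based estimation mentioned in the abstract is replaced here by a plain least-squares fit, and the genetic search is a generic elitist scheme with uniform crossover and Gaussian mutation. All function names, parameters, and default values are hypothetical.

```python
import numpy as np


def eigenvoice_weights(mean_sv, eigenvoices, observed_sv):
    """Least-squares eigenvoice weights (a simplification, not EM).

    mean_sv:      (D,)   speaker-independent mean supervector
    eigenvoices:  (K, D) one eigenvoice per row
    observed_sv:  (D,)   supervector estimated from adaptation data
    """
    w, *_ = np.linalg.lstsq(eigenvoices.T, observed_sv - mean_sv, rcond=None)
    return w


def adapt(mean_sv, eigenvoices, w):
    """Adapted supervector: mean plus weighted sum of eigenvoices."""
    return mean_sv + eigenvoices.T @ w


def genetic_search(fitness, dim, pop_size=40, generations=60, sigma=0.3, seed=0):
    """Generic elitist genetic search over weight vectors (assumed design).

    fitness: callable mapping a (dim,) vector to a score to maximize.
    """
    rng = np.random.default_rng(seed)
    pop = rng.normal(size=(pop_size, dim))
    n_elite = pop_size // 2
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        # Keep the best half unchanged, so the best score never decreases.
        elite = pop[np.argsort(scores)[::-1][:n_elite]]
        n_children = pop_size - n_elite
        parents_a = elite[rng.integers(n_elite, size=n_children)]
        parents_b = elite[rng.integers(n_elite, size=n_children)]
        # Uniform crossover followed by Gaussian mutation.
        mask = rng.random(parents_a.shape) < 0.5
        children = np.where(mask, parents_a, parents_b)
        children = children + rng.normal(scale=sigma, size=children.shape)
        pop = np.vstack([elite, children])
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmax(scores)]
```

In this sketch, a combination of the two techniques could be imitated by running `genetic_search` over the eigenvoice weights, with a fitness such as the negative squared distance between `adapt(mean_sv, eigenvoices, w)` and the observed supervector. A real system would score candidates by the likelihood of the adaptation utterances instead.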
Keywords: Speaker adaptation, Structural EigenVoices, Genetic algorithms, Hidden Markov Models
Publication Category:
International journal without reading committee
Copyright 2010-2019 © Laboratoire Connaissance et Intelligence Artificielle Distribuées - Université Bourgogne Franche-Comté