Open Access Research

Multi-candidate missing data imputation for robust speech recognition

Yujun Wang* and Hugo Van hamme

Author Affiliations

Department of ESAT, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium


EURASIP Journal on Audio, Speech, and Music Processing 2012, 2012:17 doi:10.1186/1687-4722-2012-17

Published: 29 May 2012

Abstract

The application of Missing Data Techniques (MDT) to increase the noise robustness of HMM/GMM-based large vocabulary speech recognizers is hampered by a large computational burden: the likelihood evaluations require solving many constrained least squares (CLSQ) optimization problems. As an alternative, researchers have proposed frontend MDT or have made oversimplifying independence assumptions for the backend acoustic model. In this article, we propose a fast Multi-Candidate (MC) approach that solves the per-Gaussian CLSQ problems approximately by selecting the best from a small set of candidate solutions, which are generated as the MDT solutions on a reduced set of cluster Gaussians. Experiments show that the MC MDT runs as fast as the uncompensated recognizer while achieving the accuracy of the full backend optimization approach. The experiments also show that exploiting the more accurate acoustic model of the backend pays off in terms of accuracy when compared to frontend MDT.
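The candidate-selection idea can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: Gaussians have diagonal covariances (so the bounded problem separates per dimension and reduces to clipping, whereas the article's setting requires CLSQ solves), unreliable components are bounded above by the noisy observation, and the "best" candidate for a Gaussian is the one with the smallest variance-weighted squared distance to its mean. All function and variable names are hypothetical.

```python
def impute_bounded(mean, obs, reliable):
    """Candidate clean-speech vector for one cluster Gaussian.

    Assumption: diagonal covariance, so the bounded maximum-likelihood
    estimate is separable. Reliable components keep the observed value;
    unreliable ones take the mean, clipped to the observation as upper bound.
    """
    return [o if r else min(m, o) for m, o, r in zip(mean, obs, reliable)]


def weighted_sq_dist(x, mean, var):
    """Variance-weighted squared distance (lower means higher likelihood)."""
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))


def multi_candidate_impute(gaussians, cluster_means, obs, reliable):
    """Pick, for every backend Gaussian, the best of a few shared candidates.

    gaussians:     list of (mean, var) pairs for the full acoustic model
    cluster_means: means of the reduced set of cluster Gaussians
    """
    # One candidate per cluster Gaussian, computed once per frame.
    candidates = [impute_bounded(c, obs, reliable) for c in cluster_means]
    # Each backend Gaussian only scores the candidates; no per-Gaussian solve.
    return [
        min(candidates, key=lambda x: weighted_sq_dist(x, mean, var))
        for mean, var in gaussians
    ]


# Toy frame: dimension 0 is reliable, dimension 1 is masked as unreliable.
obs = [5.0, 3.0]
reliable = [True, False]
cluster_means = [[4.0, 1.0], [6.0, 4.0]]
gaussians = [([4.5, 1.2], [1.0, 1.0]), ([5.5, 2.8], [1.0, 1.0])]
chosen = multi_candidate_impute(gaussians, cluster_means, obs, reliable)
```

In this toy frame the two candidates are `[5.0, 1.0]` and `[5.0, 3.0]`; the first Gaussian selects the former and the second selects the latter. The cost per backend Gaussian is a handful of distance evaluations rather than an optimization, which is the source of the speed-up the abstract describes.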

Keywords:
speech recognition; constrained optimization; missing data; noise robustness