Date of Original Version
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing
© 2013 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Abstract or Description
Eigenspace MLLR is effective for fast adaptation when the amount of adaptation data is limited, e.g., less than 5s. The general motivation is to represent the MLLR transform as a linear combination of basis matrices. In this paper, we present a framework to estimate a speaker-independent discriminative transform over the combination coefficients. This discriminative basis coefficients transform (DBCT) is learned by optimizing discriminative criteria over all the training speakers. During recognition, the ML basis coefficients for each testing speaker are firstly found, on which DBCT is applied to give the final MLLR transform discrimination ability. Experiments show that DBCT results in consistent WER reduction in unsupervised adaptation, compared with both standard ML and discriminatively trained transforms.
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 7927-7931.