Date of Original Version
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Abstract or Description
Recent works have shown Neural Network based Language Models (NNLMs) to be an effective modeling technique for Automatic Speech Recognition. Prior works have shown that these models obtain lower perplexity and word error rate (WER) compared to both standard n-gram language models (LMs) and more advanced language models including maximum entropy and random forest LMs. While these results are compelling, prior works were limited to evaluating NNLMs on perplexity and word error rate. Our initial results showed that while NNLMs improved speech recognition accuracy, the improvement in keyword search was negligible. In this paper we propose alternate optimizations of NNLMs for the task of keyword search. We evaluate the performance of the proposed methods for keyword search on the Vietnamese dataset provided in phase one of the BABEL project and demonstrate that by penalizing low-frequency words during NNLM training, keyword search metrics such as actual term weighted value (ATWV) can be improved by up to 9.3% compared to the standard training methods.
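For readers unfamiliar with the ATWV metric named in the abstract, the following is a minimal sketch of the term weighted value as it is commonly defined in NIST-style keyword search evaluations: one minus the per-keyword average of the miss probability plus a weighted false alarm probability. It is not taken from the paper. The names `KeywordCounts` and `atwv`, the false alarm weight of 999.9, and the convention of one non-target trial per second of speech are assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed false alarm weight; the value commonly used in NIST-style scoring.
BETA = 999.9


@dataclass
class KeywordCounts:
    """Hypothetical per-keyword tallies at the system's chosen decision threshold."""
    n_true: int         # reference occurrences of the keyword
    n_correct: int      # detections that match a reference occurrence
    n_false_alarm: int  # detections with no matching reference occurrence


def atwv(counts, speech_seconds, beta=BETA):
    """Average 1 - (P_miss + beta * P_fa) over keywords that occur in the reference."""
    # Keywords with no reference occurrence have an undefined miss rate and are skipped.
    scored = [c for c in counts if c.n_true > 0]
    if not scored:
        raise ValueError("no keyword has a reference occurrence")
    total = 0.0
    for c in scored:
        p_miss = 1.0 - c.n_correct / c.n_true
        # Assumption: one non-target trial per second of speech, minus true occurrences.
        p_fa = c.n_false_alarm / (speech_seconds - c.n_true)
        total += 1.0 - (p_miss + beta * p_fa)
    return total / len(scored)


if __name__ == "__main__":
    example = [
        KeywordCounts(n_true=50, n_correct=45, n_false_alarm=0),  # frequent keyword
        KeywordCounts(n_true=1, n_correct=0, n_false_alarm=0),    # rare keyword, missed
    ]
    print(atwv(example, speech_seconds=36000.0))
```

Under this definition every keyword contributes equally to the average, however often it occurs, so a miss on a keyword with a single reference occurrence costs as much as missing every instance of a frequent one.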
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 4888-4892.