Date of Original Version



Conference Proceeding

Journal Title

Proceedings of INTERSPEECH

First Page

1913

Last Page

1916

Rights Management

Copyright © 2011 ISCA

Abstract or Description

In this paper, we address the out-of-vocabulary (OOV) detection and recovery problem by developing three different fragment-word hybrid systems. A fragment language model (LM) and a word LM were trained separately and then combined into a single hybrid LM. Using this hybrid model, the recognizer can recognize any OOV word as a sequence of fragments. Different types of fragments, namely phones, subwords, and graphones, were tested and compared on the WSJ 5k and 20k evaluation sets. The experimental results show that the subword and graphone hybrid systems perform better than the phone hybrid system on both the 5k and 20k tasks. Furthermore, given less training data, the subword hybrid system is preferable to the graphone hybrid system.
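
As an illustration of how OOV detection can follow from hybrid decoding, the sketch below scans a recognizer hypothesis that mixes words and fragments and reports each maximal run of fragments as a candidate OOV region. It is a minimal sketch under stated assumptions, not the paper's implementation: the `_` fragment marker, the function name `find_oov_regions`, and the example hypothesis are all hypothetical, and the abstract does not say how fragments are marked in the recognizer output.

```python
from typing import List, Tuple

# Assumption: fragment tokens carry a leading marker so they can be told
# apart from in-vocabulary words. The marker itself is hypothetical.
FRAGMENT_PREFIX = "_"


def find_oov_regions(tokens: List[str]) -> List[Tuple[int, int, List[str]]]:
    """Return (start, end_exclusive, fragments) for each maximal run of
    fragment tokens in a hybrid recognizer hypothesis."""
    regions = []
    start = None  # index where the current fragment run began, if any
    for i, tok in enumerate(tokens):
        if tok.startswith(FRAGMENT_PREFIX):
            if start is None:
                start = i
        elif start is not None:
            # A word token closes the current fragment run.
            run = [t[len(FRAGMENT_PREFIX):] for t in tokens[start:i]]
            regions.append((start, i, run))
            start = None
    if start is not None:
        # The hypothesis ended while still inside a fragment run.
        run = [t[len(FRAGMENT_PREFIX):] for t in tokens[start:]]
        regions.append((start, len(tokens), run))
    return regions


# Hypothetical hypothesis: two fragments stand in for one OOV word.
hypothesis = ["the", "company", "_ex", "_xon", "reported", "earnings"]
print(find_oov_regions(hypothesis))
```

Each reported region gives the token span and the fragment sequence, which a recovery step could then map back to a written word; that mapping is outside the scope of this sketch.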



Published In

Proceedings of INTERSPEECH, 1913-1916.