Date of Original Version



Conference Proceeding

Journal Title

IEEE Spoken Language Technology Workshop (SLT)

First Page


Last Page


Rights Management

© 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Abstract or Description

We experiment with active learning for speech recognition in the context of accent adaptation. We adapt a source recognizer to the target accent by selecting a relatively small, matched subset of utterances from a large, untranscribed, multi-accented corpus for human transcription. Traditionally, active learning in speech recognition has relied on uncertainty-based sampling to choose the most informative data for manual labeling. Such an approach does not include an explicit relevance criterion during data selection, which is crucial for choosing utterances that match the target accent from datasets whose speakers span a wide range of accents. We formulate a cross-entropy-based relevance measure to complement uncertainty-based sampling in active learning for accent adaptation. We evaluate the algorithm on two different setups, for Arabic and English accents, and show that our approach compares favorably with conventional data selection. We analyze the results to show the effectiveness of our approach in finding the most relevant subset of utterances for improving the speech recognizer on the target accent.
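
The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation, which the abstract does not spell out. The sketch assumes each untranscribed utterance already carries a recognizer confidence and an average log-likelihood under both a target-accent model and a source-accent model. It further assumes that relevance is the cross-entropy difference between the two models and that the two criteria are mixed linearly after min-max normalization. All names (`Utterance`, `select_for_transcription`, `weight`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    utt_id: str
    confidence: float      # recognizer confidence in [0, 1] (assumed to be available)
    target_logprob: float  # avg. log-likelihood under a target-accent model (assumed)
    source_logprob: float  # avg. log-likelihood under the source-accent model (assumed)


def uncertainty(u: Utterance) -> float:
    # Low recognizer confidence means high uncertainty.
    return 1.0 - u.confidence


def relevance(u: Utterance) -> float:
    # Cross-entropy difference H_source - H_target, where H = -log-likelihood.
    # Larger values mean the utterance fits the target accent better (assumption).
    return u.target_logprob - u.source_logprob


def min_max(values: List[float]) -> List[float]:
    # Rescale to [0, 1] so the two criteria are comparable before mixing.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def select_for_transcription(
    pool: List[Utterance], budget: int, weight: float = 0.5
) -> List[Utterance]:
    # weight = 0.0 is pure uncertainty sampling; weight = 1.0 is pure relevance.
    if not pool or budget <= 0:
        return []
    unc = min_max([uncertainty(u) for u in pool])
    rel = min_max([relevance(u) for u in pool])
    scored = [((1.0 - weight) * a + weight * b, u) for a, b, u in zip(unc, rel, pool)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [u for _, u in scored[:budget]]
```

In this sketch, setting `weight` to 0.0 reduces to conventional uncertainty-based sampling, so the effect of adding a relevance term can be compared against that baseline on the same pool.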

Published In

IEEE Spoken Language Technology Workshop (SLT), 360-365.