Abstract or Description

Spoken document retrieval is defined as information retrieval from transcribed spoken audio, and the basic approach is described. Research experiments demonstrate that information retrieval from transcribed speech can be almost as effective as retrieval from the corresponding correct text. Retrieval performance is relatively immune to moderate amounts of speech recognition error in the transcripts. Retrieval strategies using query expansion are valuable in mitigating the effects of speech recognition errors. These can be supplemented by automatic document and query augmentation, which exploits information that may be available from large parallel text corpora. Ongoing research addresses combining multilingual retrieval with spoken document retrieval, as well as video retrieval, which requires spoken document retrieval together with video or image analysis. Some examples of successful practical applications of spoken document retrieval technology are presented.