Date of Original Version

7-2002

Type

Conference Proceeding

Abstract or Description

We examine multi-modal information retrieval from broadcast video, where on-screen text can be read through OCR and speech recognition can be performed on the audio track. OCR and speech recognition are compared on the 2001 TREC Video Retrieval evaluation corpus. Results show that OCR is more important than speech recognition for video retrieval. OCR retrieval can be further improved through dictionary-based post-processing. We demonstrate how to use imperfect multi-modal metadata to benefit multi-modal information retrieval.

DOI

10.1145/544220.544252

Published In

Proceedings of the 2nd ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL '02, 160-161.