Date of Original Version
Abstract or Table of Contents
In this paper we describe integrated multimedia processing for Video Scout, a system that segments and indexes TV programs according to their audio, visual, and transcript information. Video Scout represents a future direction for personal video recorders. In addition to using electronic program guide metadata and a user profile, Scout allows the users to request specific topics within a program. For example, users can request the video clip of the President speaking from a half-hour news program.
Video Scout has three modules: (i) Video Pre-Processing, (ii) Segmentation and Indexing, and (iii) Storage and User Interface. Segmentation and Indexing, the core of the system, incorporates a Bayesian framework that integrates information from the audio, visual, and transcript (closed captions) domains. This framework uses three layers to process low, mid, and high-level multimedia information. The high-level layer generates semantic information about TV program topics. This paper describes the elements of the system and presents results from running Video Scout on real TV programs.