Determining the physical and semantic structure of an extended video sequence is essential for providing appropriate processing, indexing and retrieval capabilities for video databases. In this paper, we describe a novel technique which reduces a sequence of MPEG-encoded video frames to a trail of points in a low-dimensional space. In this space, we can cluster frames, analyze transitions between clusters and compute properties of the resulting trail. By classifying portions of the trail as either stationary or transitional, we are able to detect gradual edits between shots. Furthermore, by tracking the interaction of clusters over time, we lay the groundwork for the complete analysis and representation of the video’s physical and semantic structure.
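
To make the stationary/transitional distinction concrete, the following is a minimal sketch and not the method of the paper. It assumes the frames have already been reduced to low-dimensional points by some earlier step, and it labels each step of the trail by comparing the distance between consecutive points to a cutoff. The function names `label_trail` and `segments`, the `threshold` parameter, and the sample coordinates are all hypothetical and chosen only for illustration.

```python
import math


def label_trail(trail, threshold):
    """Label each step between consecutive trail points.

    trail: list of equal-length coordinate tuples, one per frame
           (assumed to come from a prior dimensionality reduction).
    threshold: hypothetical distance cutoff; shorter steps are called
               "stationary", all others "transitional".
    """
    labels = []
    for prev, curr in zip(trail, trail[1:]):
        step = math.dist(prev, curr)  # Euclidean distance between frames
        labels.append("stationary" if step < threshold else "transitional")
    return labels


def segments(labels):
    """Collapse runs of identical labels into (label, first_step, last_step)."""
    runs = []
    for i, label in enumerate(labels):
        if runs and runs[-1][0] == label:
            runs[-1] = (label, runs[-1][1], i)  # extend the current run
        else:
            runs.append((label, i, i))  # start a new run
    return runs


# Illustrative 2-D trail: a tight cluster, a steady drift, another cluster.
trail = [
    (0.0, 0.0), (0.1, 0.0), (0.1, 0.1),
    (1.0, 1.0), (2.0, 2.0), (3.0, 3.0),
    (3.1, 3.0), (3.0, 3.1),
]
labels = label_trail(trail, threshold=0.5)
print(segments(labels))
```

In this toy trail, the run of long steps between the two tight clusters plays the role of a gradual edit between shots. A fixed per-step cutoff is the simplest possible criterion; a practical system would likely need smoothing or cluster-based tests.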





Proceedings of the Fifth ACM International Conference on Multimedia, MULTIMEDIA '97, 335-346.