Abstract or Description
We examine how to identify video shots that contain at least two people using only detected face information. Face detection is much more reliable than shape-based people classification in broadcast video, but its accuracy usually degrades significantly when an image contains several people, which leads to poor performance in identifying 'people' shots. Although our standard face detector operates on individual still images, we propose using statistics of the face information gathered across all images within a shot as additional evidence for deciding whether the shot belongs to the 'people' category. We empirically study which face statistics are most informative and how to combine them to achieve better prediction.
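As a rough illustration of the shot-level idea, the sketch below aggregates per-frame face counts into a few statistics and combines them into one decision. The abstract does not name the statistics or the combination method studied, so everything here is an assumption made for illustration: the function names `shot_face_statistics` and `is_people_shot`, the chosen statistics, the weights, and the threshold are all hypothetical. The input is assumed to be the number of faces a still-image detector reported in each sampled frame of a shot.

```python
from statistics import mean


def shot_face_statistics(face_counts):
    """Summarize per-frame face counts for one shot.

    face_counts: list of ints, one per sampled frame (hypothetical input
    format; any still-image face detector could produce it).
    """
    if not face_counts:
        raise ValueError("shot has no frames")
    n = len(face_counts)
    return {
        # Average number of detected faces per frame.
        "mean_faces": mean(face_counts),
        # Largest number of faces seen in any single frame.
        "max_faces": max(face_counts),
        # Fraction of frames with at least two detected faces.
        "frac_frames_two_plus": sum(c >= 2 for c in face_counts) / n,
        # Fraction of frames with at least one detected face.
        "frac_frames_any_face": sum(c >= 1 for c in face_counts) / n,
    }


def is_people_shot(face_counts, w_frac=0.6, w_mean=0.4, threshold=0.5):
    """Combine two statistics with a weighted sum (illustrative only).

    The weights and threshold are hand-picked placeholders, not values
    from the paper; in practice they would be fit on labeled shots.
    """
    stats = shot_face_statistics(face_counts)
    # Scale the mean so that two faces per frame maps to 1.0.
    scaled_mean = min(stats["mean_faces"] / 2, 1.0)
    score = w_frac * stats["frac_frames_two_plus"] + w_mean * scaled_mean
    return score >= threshold
```

With these placeholder settings, a shot whose frames yield counts `[2, 2, 0, 1]` is labeled 'people' even though the detector missed faces in some frames, while a shot with a steady single face, `[1, 1, 1, 1]`, is not. This is the kind of robustness to per-frame detection failures that pooling evidence over a shot is meant to provide.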