Date of Original Version
Abstract or Description
We propose an approach to identify and segment objects from scenes that a person (or robot) encounters in Activities of Daily Living (ADL). Images collected in those cluttered scenes contain multiple objects. Each image provides only a partial, possibly very different view of each object. An object instance discovery program must be able to link pieces of visual information from multiple images and extract the consistent patterns. Most papers on unsupervised discovery of object models are concerned with object categories. In contrast, this paper aims at identifying and extracting regions corresponding to specific object instances, e.g., two different laptops in the laptop category. By focusing on specific instances, we enforce explicit constraints on geometric consistency (such as scale and orientation) and appearance consistency (such as color, texture and shape). Using multiple segmentations as the basic building block, our program processes a noisy "soup" of segments and extracts object models as groups of mutually consistent segments. Our approach was tested on three different types of image sets: two from indoor ADL environments and one from Flickr.com. The results demonstrate the robustness of our program to severe clutter, occlusion, changes of viewpoint and interference from irrelevant images. Our approach achieves significant improvement over two existing methods.
Computer Vision (ICCV), 2011 IEEE International Conference on, 762-769.