Date of Original Version




Rights Management

The final publication is available at Springer via

Abstract or Description

Autonomous mobile robots equipped with visual perception must detect objects in order to act intelligently in their environments. Such real-time vision processing remains challenging because the object detection algorithm has to process images at the frame rate of live video. We contribute a novel algorithm that makes use of every frame, processing each frame efficiently as a “continuation” of the processing of the previous frames. From the 2D camera images captured by the robot, our algorithm, Wave3D, maintains 3D hypotheses of the presence of objects in the real 3D world relative to the robot. The algorithm does not ignore any new frame: it continues its object detection on each frame by projecting the 3D hypotheses back into the 2D image to focus the detection. Wave3D can thus be viewed as validating the 3D hypotheses in each image of the live video. Wave3D outperforms the classical static single-image approach in both processing effort and detection accuracy, in particular for moving objects. In addition, the reduced vision processing time leaves more computation available for task-related behaviors, which is greatly needed in situated autonomous intelligent robot agents. We conduct targeted experiments on the humanoid NAO robot that illustrate the effectiveness of Wave3D.
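
The step of projecting a 3D hypothesis back into the 2D image to focus detection can be illustrated with a minimal sketch. This is not the Wave3D implementation: it assumes a simple pinhole camera model, a hypothesis already expressed in the camera frame, and a spherical object of known approximate radius. The names `Hypothesis3D`, `project_to_roi`, and the `margin` parameter are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Hypothesis3D:
    """Hypothetical 3D object hypothesis in the camera frame (metres)."""
    x: float       # right of the optical axis
    y: float       # below the optical axis
    z: float       # depth along the optical axis
    radius: float  # assumed approximate object radius


def project_to_roi(
    h: Hypothesis3D,
    fx: float, fy: float,   # focal lengths in pixels (assumed known)
    cx: float, cy: float,   # principal point in pixels (assumed known)
    width: int, height: int,
    margin: float = 1.5,    # enlarges the window to tolerate motion/noise
) -> Optional[Tuple[int, int, int, int]]:
    """Return (u_min, v_min, u_max, v_max), or None if the hypothesis
    does not fall inside the image."""
    if h.z <= 0:
        return None  # behind the camera, cannot be observed

    # Pinhole projection of the hypothesis centre.
    u = fx * h.x / h.z + cx
    v = fy * h.y / h.z + cy

    # Projected half-size of the object, scaled by the safety margin.
    half_w = margin * fx * h.radius / h.z
    half_h = margin * fy * h.radius / h.z

    # Clip the window to the image bounds (max bounds are exclusive).
    u_min = max(0, int(u - half_w))
    v_min = max(0, int(v - half_h))
    u_max = min(width, int(u + half_w) + 1)
    v_max = min(height, int(v + half_h) + 1)

    if u_min >= u_max or v_min >= v_max:
        return None  # window lies entirely outside the image
    return (u_min, v_min, u_max, v_max)


if __name__ == "__main__":
    ball = Hypothesis3D(x=0.0, y=0.0, z=2.0, radius=0.125)
    print(project_to_roi(ball, 500.0, 500.0, 320.0, 240.0, 640, 480, margin=1.0))
```

Under these assumptions, a detector would only need to scan the returned window in each new frame rather than the full image, which is the kind of saving in processing effort the abstract describes.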





Published In

Progress in Artificial Intelligence, 1(4), 259-265.