Date of Award


Embargo Period


Degree Name

Doctor of Philosophy (PhD)


Robotics Institute


Takeo Kanade

Second Advisor

James Kuffner

Third Advisor

Paul Rybski

Fourth Advisor

Kei Okada


Society is becoming more automated, with robots beginning to perform most tasks in factories
and starting to help out in home and office environments. One of the most important
functions of robots is the ability to manipulate objects in their environment. Because the
space of possible robot designs, sensor modalities, and target tasks is huge, researchers end
up having to manually create many models, databases, and programs for their specific task,
an effort that is repeated whenever the task changes. Given a specification for a robot and
a task, the presented framework automatically constructs the necessary databases and programs
required for the robot to reliably execute manipulation tasks. It includes contributions
in three major components that are critical for manipulation tasks.
The first is a geometric-based planning system that analyzes all necessary modalities of
manipulation planning and offers efficient algorithms to formulate and solve them. This
allows identification of the necessary information needed from the task and robot specifications.
Using this set of analyses, we build a planning knowledge-base that allows informative
geometric reasoning about the structure of the scene and the robot's goals. We show how
to efficiently generate and query the information for planners.
The second is a set of efficient algorithms considering the visibility of objects in cameras
when choosing manipulation goals. We show results with several robot platforms using
gripper cameras to boost accuracy of the detected objects and to reliably complete the
tasks. Furthermore, we use the presented planning and visibility infrastructure to develop
a completely automated extrinsic camera calibration method and a method for detecting
insufficient calibration data.

The third is a vision-centric database that can analyze a rigid object's surface for stable
and discriminable features to be used in pose extraction programs. Furthermore, we show
work towards a new voting-based object pose extraction algorithm that does not rely on
2D/3D feature correspondences and thus reduces the early-commitment problem plaguing
the generality of traditional vision-based pose extraction algorithms.
In order to reinforce our theoretical contributions with a solid implementation basis, we discuss
the open-source planning environment OpenRAVE, which began and evolved as a result of
the work done in this thesis. We present an analysis of its architecture and provide insight
for successful robotics software environments.