Degree Name

Doctor of Philosophy (PhD)


Robotics Institute


Martial Hebert


In this thesis, we describe a data-driven approach that leverages repositories of 3D models for scene understanding. Relating what we see in an image to a large collection of 3D models allows us to transfer information from these models to the image, creating a rich understanding of the scene. We develop a framework for auto-calibrating a camera, rendering 3D models from the viewpoint from which an image was taken, and computing a similarity measure between each 3D model and an input image. We demonstrate this data-driven approach in the context of geometry estimation and show that it can recover the identities, poses, and styles of objects in a scene.

We begin by presenting a proof-of-concept algorithm for matching 3D models with input images. Next, we present a series of extensions to this baseline approach. Our goals here are threefold. First, we aim to produce more accurate reconstructions of a scene by determining the exact style and size of objects and by precisely localizing their positions. Second, we aim to increase the robustness of our scene-matching approach by incorporating new features and expanding our search space to include many viewpoint hypotheses. Lastly, we address the computational challenges of our approach by presenting algorithms that explore the space of 3D scene hypotheses more efficiently without sacrificing the quality of the results.

We conclude by presenting various applications of our geometric scene understanding approach. We start by demonstrating the effectiveness of our algorithm for traditional applications such as object detection and segmentation. In addition, we present two novel applications that incorporate our geometry estimates: affordance estimation and geometry-aware object insertion for photorealistic rendering.


