Date of Original Version
Object Categorization: Computer and Human Vision Perspective. Ed. by Sven Dickinson, Ales Leonardis, Bernt Schiele, Michael Tarr, Cambridge University Press, 2009
Abstract or Table of Contents
Features associated with an object or its surfaces in natural scenes tend to vary coherently in space and time. In the psychological literature, these coherent covariations have been described as important for neural systems to acquire models of objects and object categories. From a statistical inference perspective, such coherent covariation can provide a mechanism for learning statistical priors in natural scenes that are useful for probabilistic inference. In this article, we present neurophysiological observations from the early visual cortex that provide insights into how correlation structures in visual scenes are encoded by neuronal tuning and by connections among neurons. The key insight is that correlated structures in visual scenes result in correlated neuronal activities, which in turn shape the tuning properties of individual neurons and the connections between them, embedding Gestalt-related computational constraints or priors for surface inference. Extending these concepts to the inferotemporal cortex suggests a representational framework that is distinct from the traditional feed-forward hierarchy of invariant object representation and recognition. In this framework, lateral connections among view-based neurons, learned from the temporal association of object views observed over time, can form a linked graph structure with local dependency, akin to a dense aspect graph in computer vision. This web-like graph allows view-invariant object representation to be created using sparse feed-forward connections, while maintaining an explicit representation of the different views. Thus, it can serve as an effective prior model for generating predictions of future incoming views to facilitate object inference.