Degree Name

Doctor of Philosophy (PhD)


Robotics Institute


Fernando De la Torre


Enabling computers to understand human and animal behavior has the potential to revolutionize many areas that benefit society, such as clinical diagnosis, human-computer interaction, and social robotics. Critical to the understanding of human and animal behavior, and of any temporally varying phenomenon in general, is the capability to segment, classify, and cluster time series data. This thesis proposes segment-based Support Vector Machines (Seg-SVMs), a framework for supervised, weakly-supervised, and unsupervised time series analysis. Seg-SVMs outperform state-of-the-art approaches by combining three powerful ideas: energy-based structure prediction, bag-of-words representation, and maximum-margin learning. Energy-based structure prediction provides a principled mechanism for concurrent top-down recognition and bottom-up temporal localization. Bag-of-words representation provides segment-based features that tolerate misalignment errors and are computationally efficient. Maximum-margin learning, as in SVMs and Structured Output SVMs, has a convex learning formulation; it produces classifiers that are discriminative and less prone to over-fitting.
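As a rough illustration of why bag-of-words segment features are cheap to compute and tolerant of small boundary errors, the sketch below scores every candidate segment of a quantized time series with a linear classifier. It is not the thesis's implementation: the function names (`cumulative_histograms`, `segment_histogram`, `best_segment`), the unnormalized histogram, and the brute-force search are assumptions made for illustration, and the weight vector `w` is taken as given rather than learned by maximum-margin training.

```python
import numpy as np


def cumulative_histograms(words, vocab_size):
    """Return C of shape (T + 1, vocab_size), where C[t] is the
    histogram of codeword indices in words[:t]."""
    words = np.asarray(words)
    one_hot = np.zeros((len(words), vocab_size))
    one_hot[np.arange(len(words)), words] = 1.0
    # Prepend a zero row so that C[0] is the empty-prefix histogram.
    return np.vstack([np.zeros((1, vocab_size)), np.cumsum(one_hot, axis=0)])


def segment_histogram(cum, start, end):
    """Unnormalized bag-of-words histogram of frames start .. end - 1."""
    return cum[end] - cum[start]


def best_segment(cum, w, b=0.0, min_len=1, max_len=None):
    """Brute-force search for the segment [start, end) that maximizes
    the linear score w . h(start, end) + b (illustrative only)."""
    T = cum.shape[0] - 1
    max_len = T if max_len is None else max_len
    best = (-np.inf, None, None)
    for start in range(T):
        for end in range(start + min_len, min(T, start + max_len) + 1):
            score = float(w @ segment_histogram(cum, start, end)) + b
            if score > best[0]:
                best = (score, start, end)
    return best
```

Because the histogram of any segment is the difference of two cumulative rows, it costs time proportional to the vocabulary size regardless of segment length, and moving a boundary by one frame changes exactly one histogram count.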

In this thesis, we show that Seg-SVMs outperform state-of-the-art approaches for segmenting, classifying, and clustering human and animal behavior in video and accelerometer data of varying complexity. We illustrate these benefits on the problems of facial event detection, sequence labeling of human actions, and temporal clustering of animal behavior. In addition, the Seg-SVM framework naturally provides solutions to two novel problems: early detection of human actions and weakly-supervised discovery of discriminative events.