Date of Original Version



Conference Proceeding

Journal Title

Proceedings of the Conference on Computational Language Learning

First Page


Last Page


Rights Management

Copyright 2014 Association for Computational Linguistics

Abstract or Description

We present a Bayesian formulation for weakly-supervised learning of a Combinatory Categorial Grammar (CCG) supertagger with an HMM. We assume supervision in the form of a tag dictionary, and our prior encourages the use of cross-linguistically common category structures as well as transitions between tags that can combine locally according to CCG’s combinators. Our prior is theoretically appealing since it is motivated by language-independent, universal properties of the CCG formalism. Empirically, we show that it yields substantial improvements over previous work that used similar biases to initialize an EM-based learner. Additional gains are obtained by further shaping the prior with corpus-specific information that is extracted automatically from raw text and a tag dictionary.



Published In

Proceedings of the Conference on Computational Language Learning, 141-150.