Date of Original Version



Technical Report

Rights Management

All Rights Reserved

Abstract or Description

We describe a new formalism for word morphology. Our model views word generation as a random walk on a trellis of units, where each unit is a set of (short) strings. The model naturally incorporates the segmentation of words into morphemes. We capture the statistics of unit generation using a probabilistic suffix tree (PST), a variant of variable-length Markov models. We present an efficient algorithm that learns a PST over the units, yielding a compact stochastic representation of morphological structure. We demonstrate the applicability of our approach by using the model in an allomorphy decision problem.
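As a rough illustration of the variable-length Markov idea behind a PST, the sketch below predicts the next unit from the longest previously seen suffix of the history. It is not the report's learning algorithm. It makes several simplifying assumptions: each unit is a single string rather than a set of strings, probabilities are plain relative frequencies with no smoothing or pruning, and the names `SuffixContextModel`, `max_depth`, and `choose_allomorph` are hypothetical.

```python
from collections import defaultdict


class SuffixContextModel:
    """Toy variable-length Markov model over sequences of units (assumed: strings)."""

    def __init__(self, max_depth=2):
        self.max_depth = max_depth  # longest context length that is stored
        # counts[context][unit] = times `unit` followed the tuple `context`
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, sequences):
        for seq in sequences:
            for i, unit in enumerate(seq):
                # Record the unit under every context length from 0 to max_depth.
                for depth in range(min(i, self.max_depth) + 1):
                    context = tuple(seq[i - depth:i])
                    self.counts[context][unit] += 1

    def longest_context(self, history):
        """Return the longest suffix of `history` that was seen in training."""
        history = tuple(history)
        for depth in range(min(len(history), self.max_depth), -1, -1):
            context = history[len(history) - depth:]
            if context in self.counts:
                return context
        return ()

    def prob(self, history, unit):
        """Relative frequency of `unit` after the longest matching context."""
        dist = self.counts.get(self.longest_context(history), {})
        total = sum(dist.values())
        return dist.get(unit, 0) / total if total else 0.0


def choose_allomorph(model, history, candidates):
    """Pick the candidate unit the model finds most probable after `history`."""
    return max(candidates, key=lambda unit: model.prob(history, unit))


# Made-up segmented words, used only to exercise the code.
model = SuffixContextModel(max_depth=2)
model.train([["walk", "ed"], ["walk", "ing"], ["talk", "ed"], ["walk", "ed"]])
print(model.prob(["walk"], "ed"))
print(choose_allomorph(model, ["walk"], ["ed", "ing"]))
```

When the history has never been observed, the lookup backs off to shorter suffixes and finally to the empty context, which is what makes the effective context length variable.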