ParaMor: Finding Paradigms across Morphology
All Rights Reserved
Abstract or Description
Our algorithm, ParaMor, fared well in Morpho Challenge 2007 (Kurimo et al., 2007), a peer-operated competition among algorithms designed to discover the morphological structure of natural languages from nothing more than raw text. ParaMor constructs sets of affixes that closely mimic the paradigms of a language and, with these structures in hand, annotates word forms with morpheme boundaries. Of the four language tracks in Morpho Challenge 2007, we entered ParaMor in English and German. Morpho Challenge 2007 evaluated systems on their precision, recall, and balanced F1 at identifying morphological processes, whether those processes mark derivational morphology or inflectional features. In English, ParaMor’s balanced precision and recall yield a higher F1 than an already sophisticated baseline induction algorithm, Morfessor (Creutz, 2006). ParaMor placed fourth in English overall. In German, ParaMor suffers from low morpheme recall. But combining ParaMor’s analyses with analyses from Morfessor yields a set of analyses that outperforms either algorithm alone and that places first in F1 among all algorithms submitted to Morpho Challenge 2007.
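As a brief illustration of why balanced precision and recall help at F1 while low recall hurts, the sketch below computes balanced F1 as the harmonic mean of precision and recall. The function name `f1_score` and the numbers are hypothetical and chosen only for illustration; they are not results from Morpho Challenge 2007, and the sketch does not reproduce how the competition itself measured precision and recall.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F1: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Avoid division by zero when both scores are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical scores, not taken from the competition.
balanced = f1_score(0.5, 0.5)   # precision and recall are equal
lopsided = f1_score(0.9, 0.1)   # high precision, low recall

print(f"balanced F1: {balanced:.2f}")
print(f"lopsided F1: {lopsided:.2f}")
```

Both hypothetical systems have the same arithmetic mean of precision and recall, yet the harmonic mean penalizes the lopsided one heavily, which is why a low recall can drag down F1 even when precision is high.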