ParaMor: Finding Paradigms across Morphology

Monson, Christian; Carbonell, Jaime G.; Lavie, Alon; Levin, Lorraine

doi:10.1184/R1/6608282.v1

journal contribution

posted on 2005-01-01, 00:00 authored by Christian Monson, Jaime G. Carbonell, Alon Lavie, Lorraine Levin

Our algorithm, ParaMor, fared well in Morpho Challenge 2007 (Kurimo et al., 2007), a peer-operated competition that pits against one another algorithms designed to discover the morphological structure of natural languages from nothing more than raw text. ParaMor constructs sets of affixes that closely mimic the paradigms of a language and, with these structures in hand, annotates word forms with morpheme boundaries. Of the four language tracks in Morpho Challenge 2007, we entered ParaMor in English and German. Morpho Challenge 2007 evaluated systems on their precision, recall, and balanced F₁ at identifying morphological processes, whether those processes mark derivational morphology or inflectional features. In English, ParaMor’s balanced precision and recall yield an F₁ that outperforms an already sophisticated baseline induction algorithm, Morfessor (Creutz, 2006). ParaMor placed fourth in English overall. In German, ParaMor suffers from low morpheme recall. But combining ParaMor’s analyses with analyses from Morfessor yields a set of analyses that outperforms either algorithm alone and that places first in F₁ among all algorithms submitted to Morpho Challenge 2007.
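For readers unfamiliar with the metric named in the abstract, balanced F₁ is the harmonic mean of precision and recall. The sketch below is a minimal illustration only: the function name `balanced_f1` and the numeric inputs are hypothetical and are not taken from the paper or from the Morpho Challenge 2007 results.

```python
def balanced_f1(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (balanced F1).

    Returns 0.0 when both inputs are zero, to avoid dividing by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical scores, chosen only to show how the metric behaves.
balanced = balanced_f1(0.5, 0.5)    # precision and recall in balance
low_recall = balanced_f1(0.9, 0.1)  # high precision, low recall

print(f"balanced:   {balanced:.2f}")
print(f"low recall: {low_recall:.2f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a system with high precision but low recall scores well below one whose precision and recall are moderate but balanced. This is consistent with the abstract's contrast between balanced precision and recall in English and low morpheme recall in German.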

History

Publisher Statement

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Date

2005-01-01

Keywords

Unsupervised Natural Language Morphology Induction Paradigms

Licence

In Copyright
