Integrating Morphology with Multi-word Expression Processing in Turkish

journal contribution
posted on 2004-07-01, 00:00 authored by Kemal Oflazer, Ozlem Cetinoglu, Bilge Say
This paper describes a multi-word expression processor for preprocessing Turkish text for various language engineering applications. In addition to the fairly standard set of lexicalized collocations and multi-word expressions such as named entities, Turkish uses quite a wide range of semi-lexicalized and non-lexicalized collocations. After an overview of relevant aspects of Turkish, we present a description of the multi-word expressions we handle. We then summarize the computational setting in which we employ a series of components for tokenization, morphological analysis, and multi-word expression extraction. Finally, we present results from runs over a large corpus and a small gold-standard corpus.

Publisher Statement

Published in Second ACL Workshop on Multiword Expressions: Integrating Processing, July 2004, pp. 64-71, Barcelona, Spain

Date

2004-07-01
