A syllable based word recognition model for Korean noun extraction

Do Gil Lee, Hae-Chang Rim, Heui Seok Lim

    Research output: Contribution to journal › Conference article › peer-review

    5 Citations (Scopus)

    Abstract

    Noun extraction is important for many NLP applications such as information retrieval, automatic text classification, and information extraction. Most previous Korean noun extraction systems use a morphological analyzer or a Part-of-Speech (POS) tagger. They therefore require considerable linguistic knowledge, such as morpheme dictionaries and rules (e.g., morphosyntactic rules and morphological rules). This paper proposes a new noun extraction method that uses a syllable-based word recognition model. It finds the most probable syllable-tag sequence of the input sentence by using statistical information acquired automatically from a POS-tagged corpus, and it extracts nouns by detecting word boundaries. Furthermore, it requires no manual labor to construct or maintain linguistic knowledge. We have performed various experiments with a wide range of variables influencing the performance. The experimental results show that, without morphological analysis or POS tagging, the proposed method achieves performance comparable to that of previous methods.
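
The idea of finding the most probable syllable-tag sequence and then reading nouns off the detected boundaries can be illustrated with a small sketch. This is not the authors' model: it assumes a first-order hidden Markov model with add-one smoothing, a hypothetical three-tag scheme (`B` for the first syllable of a noun, `I` for a later syllable of a noun, `O` for everything else), and a two-sentence toy corpus invented for the example. Spaces are ignored for simplicity, and the names `train`, `viterbi`, and `extract_nouns` are illustrative only.

```python
import math
from collections import Counter

TAGS = ("B", "I", "O")  # assumed scheme: noun-initial, noun-internal, non-noun
START = "<s>"           # pseudo-tag marking the start of a sentence


def train(tagged_sentences):
    """Collect counts from (syllable, tag) sequences; return log-prob functions."""
    trans, ctx, emit, tag_count, vocab = Counter(), Counter(), Counter(), Counter(), set()
    for sent in tagged_sentences:
        prev = START
        for syl, tag in sent:
            trans[(prev, tag)] += 1
            ctx[prev] += 1
            emit[(tag, syl)] += 1
            tag_count[tag] += 1
            vocab.add(syl)
            prev = tag

    def log_trans(prev, tag):
        # Add-one smoothed P(tag | previous tag).
        return math.log((trans[(prev, tag)] + 1) / (ctx[prev] + len(TAGS)))

    def log_emit(tag, syl):
        # Add-one smoothed P(syllable | tag), with one extra slot for unseen syllables.
        return math.log((emit[(tag, syl)] + 1) / (tag_count[tag] + len(vocab) + 1))

    return log_trans, log_emit


def viterbi(syllables, log_trans, log_emit):
    """Return the most probable tag sequence for a list of syllables."""
    if not syllables:
        return []
    # best[tag] = (log-probability, path) of the best sequence ending in tag.
    best = {t: (log_trans(START, t) + log_emit(t, syllables[0]), [t]) for t in TAGS}
    for syl in syllables[1:]:
        new_best = {}
        for t in TAGS:
            score, path = max(
                ((best[p][0] + log_trans(p, t), best[p][1]) for p in TAGS),
                key=lambda item: item[0],
            )
            new_best[t] = (score + log_emit(t, syl), path + [t])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


def extract_nouns(syllables, tags):
    """Join each B I* run into one noun; an I with no preceding B is ignored."""
    nouns, current = [], []
    for syl, tag in zip(syllables, tags):
        if tag == "B":
            if current:
                nouns.append("".join(current))
            current = [syl]
        elif tag == "I" and current:
            current.append(syl)
        else:
            if current:
                nouns.append("".join(current))
            current = []
    if current:
        nouns.append("".join(current))
    return nouns


# Toy corpus (invented for illustration): "학교에 간다" and "학생이 온다".
TOY_CORPUS = [
    [("학", "B"), ("교", "I"), ("에", "O"), ("간", "O"), ("다", "O")],
    [("학", "B"), ("생", "I"), ("이", "O"), ("온", "O"), ("다", "O")],
]

log_trans, log_emit = train(TOY_CORPUS)
sentence = list("학교에온다")  # unseen combination of seen syllables
print(extract_nouns(sentence, viterbi(sentence, log_trans, log_emit)))  # ['학교']
```

In this sketch, everything the decoder knows comes from counts over the tagged corpus, which mirrors the abstract's point that no hand-built dictionary or rule set is needed. A realistic system would need a far larger corpus and most likely a richer tag set and context than this toy uses.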

    Original language: English
    Journal: Proceedings of the Annual Meeting of the Association for Computational Linguistics
    Volume: 2003-July
    Publication status: Published - 2003
    Event: 41st Annual Meeting of the Association for Computational Linguistics, ACL 2003 - Sapporo, Japan
    Duration: 2003 Jul 7 → 2003 Jul 12

    Bibliographical note

    Publisher Copyright:
    © ACL 2003. All rights reserved.

    ASJC Scopus subject areas

    • Computer Science Applications
    • Linguistics and Language
    • Language and Linguistics
