Korean spacing by improving Viterbi segmentation

Gumwon Hong, Hae Chang Rim

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    1 Citation (Scopus)

    Abstract

    This paper presents a Korean spacing approach which employs an improved Viterbi segmentation model. Traditional Viterbi segmentation using the word unigram language model is simple and fast, but has two problems: data sparseness and an improper preference for fewer segments. To overcome these limitations, the segmentation model is extended by employing a split probability based on character bigrams. Contextual information is selectively used to further resolve spacing ambiguities without much increase in complexity. Experimental results show that the extended model performs better than the traditional segmentation model. Furthermore, compared to the state-of-the-art system, our approach achieves better efficiency in terms of processing time without losing significant accuracy.
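
    The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a baseline that scores a segmentation by summing word unigram log-probabilities, and an extension that adds, at every position between two characters, the log of a bigram-conditioned split probability (or of its complement when no space is placed there). The function name `viterbi_segment`, the parameters `max_len` and `unk_logp`, the default split probability of 0.5, and all numbers in the toy tables are hypothetical and chosen for illustration only.

```python
import math


def viterbi_segment(text, word_logp, split_prob=None, max_len=6, unk_logp=-20.0):
    """Return the best-scoring word list for `text` (a string without spaces).

    word_logp  : dict mapping word -> unigram log-probability (toy values here).
    split_prob : optional dict mapping a character bigram -> assumed probability
                 (strictly between 0 and 1) that a space falls between its two
                 characters. Unseen bigrams default to 0.5.
    """
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best score of a segmentation of text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last word in that segmentation
    best[0] = 0.0

    def p_split(i):
        # Probability of a space between text[i-1] and text[i].
        return split_prob.get(text[i - 1:i + 1], 0.5)

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in word_logp:
                score = word_logp[word]
            elif len(word) == 1:
                score = unk_logp  # fallback so every position stays reachable
            else:
                continue
            if split_prob is not None:
                # No space inside the candidate word ...
                for i in range(start + 1, end):
                    score += math.log(1.0 - p_split(i))
                # ... and a space right after it, unless it ends the text.
                if end < n:
                    score += math.log(p_split(end))
            total = best[start] + score
            if total > best[end]:
                best[end], back[end] = total, start

    words = []
    pos = n
    while pos > 0:
        words.append(text[back[pos]:pos])
        pos = back[pos]
    return words[::-1]


# Toy data (invented numbers, not taken from the paper).
text = "아버지가방에들어가신다"
word_logp = {"아버지": -3.0, "아버지가": -4.0, "가방에": -3.5,
             "방에": -3.0, "들어가신다": -4.0}
split_prob = {"지가": 0.1, "가방": 0.6}

baseline = " ".join(viterbi_segment(text, word_logp))
extended = " ".join(viterbi_segment(text, word_logp, split_prob))
print(baseline)  # 아버지 가방에 들어가신다
print(extended)  # 아버지가 방에 들어가신다
```

    In this toy setting the unigram-only baseline picks the reading whose words happen to have higher unigram scores, while the bigram split term shifts the decision to the other reading. Because every inter-character position contributes exactly one term (split or no split), segmentations with different numbers of words are scored over the same number of factors, which is one plausible way to counter a bias toward fewer segments.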

    Original language: English
    Title of host publication: Proceedings - ALPIT 2007 6th International Conference on Advanced Language Processing and Web Information Technology
    Pages: 75-80
    Number of pages: 6
    Publication status: Published - 2007
    Event: 6th International Conference on Advanced Language Processing and Web Information Technology, ALPIT 2007 - Luoyang, Henan, China
    Duration: 2007 Aug 22 → 2007 Aug 24

    ASJC Scopus subject areas

    • General Computer Science
    • Information Systems
