Automatic spelling correction rule extraction and application for spoken-style Korean text

  • Jeung Hyun Byun*
  • Hae Chang Rim
  • So Young Park
  • *Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    4 Citations (Scopus)

    Abstract

    Spoken-style text is now widespread because a great deal of information is written in a spoken style, for example in Short Message Service (SMS) messages. However, spoken-style text contains more spelling errors than traditional written-style text. In this paper, we propose a rule-based spelling correction system that automatically extracts spelling correction rules from a correction corpus and applies the extracted rules to spelling errors in input sentences. To preserve both high precision and high recall, we devise a candidate-elimination algorithm that determines the appropriate context size of each spelling correction rule based on rule accuracy. Experimental results show that the proposed system extracts 42,537 spelling correction rules and, by applying them to the test corpus, raises precision from 31.08% to 79.04% at the message level.
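The abstract does not spell out the candidate-elimination algorithm, so the following is only a minimal sketch of the general idea of choosing a rule's context size by its accuracy, not the authors' method. It assumes a hypothetical corpus format (sentences as lists of aligned `(source_token, corrected_token)` pairs), a hypothetical accuracy threshold `min_accuracy`, and English toy data in place of Korean. For each error, it keeps the smallest context window whose rule is accurate enough on the corpus.

```python
from collections import Counter, defaultdict

PAD = "<s>"  # hypothetical sentence-boundary marker (an assumption)


def pattern_at(src, i, k):
    """Return (left context, token, right context) with k tokens on each side."""
    padded = [PAD] * k + list(src) + [PAD] * k
    j = i + k
    return (tuple(padded[j - k:j]), padded[j], tuple(padded[j + 1:j + 1 + k]))


def extract_rules(corpus, max_k=2, min_accuracy=0.9):
    """Keep, for each observed error, the smallest context that is accurate enough."""
    # counts[k][pattern][target] = how often this pattern maps to this target
    counts = [defaultdict(Counter) for _ in range(max_k + 1)]
    for sentence in corpus:
        src = [s for s, _ in sentence]
        for i, (_, tgt) in enumerate(sentence):
            for k in range(max_k + 1):
                counts[k][pattern_at(src, i, k)][tgt] += 1

    rules = {}
    for sentence in corpus:
        src = [s for s, _ in sentence]
        for i, (s, tgt) in enumerate(sentence):
            if s == tgt:
                continue  # not an error, so no rule is needed here
            for k in range(max_k + 1):  # try the smallest context first
                pat = pattern_at(src, i, k)
                seen = counts[k][pat]
                if seen[tgt] / sum(seen.values()) >= min_accuracy:
                    rules[pat] = tgt
                    break
    return rules


def apply_rules(tokens, rules, max_k=2):
    """Rewrite each token using the most specific rule that matches it."""
    out = []
    for i, tok in enumerate(tokens):
        for k in range(max_k, -1, -1):
            pat = pattern_at(tokens, i, k)
            if pat in rules:
                tok = rules[pat]
                break
        out.append(tok)
    return out


# Toy aligned corpus (illustrative only, not data from the paper).
corpus = [
    [("c", "see"), ("u", "you"), ("soon", "soon")],
    [("thank", "thank"), ("u", "you")],
    [("vitamin", "vitamin"), ("c", "c")],
]
rules = extract_rules(corpus)
print(apply_rules(["c", "u", "later"], rules))  # ['see', 'you', 'later']
print(apply_rules(["vitamin", "c"], rules))     # ['vitamin', 'c']
```

In this toy run, "u" is always an error, so a context-free rule suffices, while "c" is only sometimes an error, so its rule needs one token of context on each side. A narrower context raises recall and a wider one raises precision, which is the trade-off the abstract refers to.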

    Original language: English
    Title of host publication: Proceedings - ALPIT 2007 6th International Conference on Advanced Language Processing and Web Information Technology
    Pages: 195-199
    Number of pages: 5
    Publication status: Published - 2007
    Event: 6th International Conference on Advanced Language Processing and Web Information Technology, ALPIT 2007 - Luoyang, Henan, China
    Duration: 2007 Aug 22 to 2007 Aug 24

    Publication series

    Name: Proceedings - ALPIT 2007 6th International Conference on Advanced Language Processing and Web Information Technology

    Other

    Other: 6th International Conference on Advanced Language Processing and Web Information Technology, ALPIT 2007
    Country/Territory: China
    City: Luoyang, Henan
    Period: 07/8/22 to 07/8/24

    ASJC Scopus subject areas

    • General Computer Science
    • Information Systems
