Neural spelling correction: translating incorrect sentences to correct sentences for multimedia

Chanjun Park, Kuekyeng Kim, Yeong Wook Yang, Minho Kang, Heuiseok Lim

    Research output: Contribution to journal › Article › peer-review

    12 Citations (Scopus)

    Abstract

    The aim of a spelling correction task is to detect spelling errors and automatically correct them. In this paper, we approach Korean spelling correction from a machine translation perspective, which allows it to overcome limitations of cost, time, and data. Using a sequence-to-sequence model, we treat an ‘error-filled sentence’ as the source and its correct counterpart as the target, thus ‘translating’ the erroneous sentence into a correct one. We also propose three new data generation methods that allow multiple spelling correction parallel corpora to be created from a single monolingual corpus. Additionally, we find that applying the Copy Mechanism not only resolves the problem of overcorrection but also improves performance. We evaluate our model on two aspects: performance comparison with other models and evaluation of overcorrection. The results show that the proposed model outperforms even systems currently in commercial use.
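
The abstract does not spell out the three data generation methods, so the following is only a minimal sketch of the general idea: deriving (noisy source, clean target) sentence pairs from a monolingual corpus. The noise operations used here (random character deletion, duplication, and adjacent swap), the function names `add_noise` and `make_parallel_corpus`, and the `rate` parameter are all assumptions made for illustration, not the authors' method.

```python
import random


def add_noise(sentence, rate, rng):
    """Return a corrupted copy of `sentence`.

    Hypothetical noise model: each non-space character is, with
    probability `rate`, deleted, duplicated, or swapped with its
    right-hand neighbour.
    """
    chars = list(sentence)
    out = []
    i = 0
    while i < len(chars):
        if chars[i] != " " and rng.random() < rate:
            op = rng.choice(("delete", "duplicate", "swap"))
            if op == "delete":
                i += 1
                continue
            if op == "duplicate":
                out.extend([chars[i], chars[i]])
                i += 1
                continue
            if op == "swap" and i + 1 < len(chars):
                out.extend([chars[i + 1], chars[i]])
                i += 2
                continue
            # A swap requested on the last character falls through
            # and the character is kept unchanged.
        out.append(chars[i])
        i += 1
    return "".join(out)


def make_parallel_corpus(sentences, rate=0.1, seed=0):
    """Build (noisy source, clean target) pairs from clean sentences."""
    rng = random.Random(seed)
    return [(add_noise(s, rate, rng), s) for s in sentences]
```

Calling `make_parallel_corpus` several times with different `seed` or `rate` values yields several distinct parallel corpora from the same monolingual input, which is the property the abstract highlights. A realistic Korean noise model would more likely operate on decomposed jamo or on known error patterns than on whole syllable characters as done here.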

    Original language: English
    Pages (from-to): 34591-34608
    Number of pages: 18
    Journal: Multimedia Tools and Applications
    Volume: 80
    Issue number: 26-27
    Publication status: Published - 2021 Nov

    Bibliographical note

    Funding Information:
    This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2020-2018-0-01405) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation), and by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. NRF-2017M3C4A7068189). I am very grateful to my friend Yejin Jang for helping me correct the English.

    Publisher Copyright:
    © 2020, Springer Science+Business Media, LLC, part of Springer Nature.

    Keywords

    • Automatic noise generation
    • Copy mechanism
    • Korean spelling correction
    • Neural machine translation
    • Overcorrection
    • Transformer

    ASJC Scopus subject areas

    • Software
    • Media Technology
    • Hardware and Architecture
    • Computer Networks and Communications
