Abstract
Recently, active research has been conducted on Korean grammatical error correction based on machine translation (MT) and automatic noise generation. However, there is no gold-standard test set for objective, official comparative analysis. A significant limitation is that measured performance is ill-defined, because the error types used in the training set also appear in the test set. Additionally, the error types used for qualitative analysis are defined differently across studies, with no explicit guidelines. This study proposes a gold-standard test set for Korean grammatical error correction, the Korean Neural Grammatical Correction Test set (K-NCT), built on a new error type classification guideline. To ensure the factuality and reliability of the proposal, we conduct a quantitative analysis using a commercial system and a human evaluation. Experimental results demonstrate that the proposed test set is well-balanced and diverse and follows a precise guideline. Our dataset is available at https://github.com/seonminkoo/K-NCT
Original language | English |
---|---|
Pages (from-to) | 118167-118175 |
Number of pages | 9 |
Journal | IEEE Access |
Volume | 10 |
Publication status | Published - 2022 |
Bibliographical note
Funding Information: This work was supported in part by the Ministry of Science and ICT, South Korea, under the Information Technology Research Center Support Program supervised by the Institute for Information and Communications Technology Planning and Evaluation, under Grant IITP-2018-0-01405; and in part by the Basic Science Research Program through the National Research Foundation of Korea, Ministry of Education, under Grant NRF-2022R1A2C1007616.
Publisher Copyright:
© 2013 IEEE.
Keywords
- Korean grammar correction
- error standard
- gold test set
- human evaluation
ASJC Scopus subject areas
- General Engineering
- General Materials Science
- General Computer Science