TY - GEN
T1 - Adversarial subword regularization for robust neural machine translation
AU - Park, Jungsoo
AU - Sung, Mujeen
AU - Lee, Jinhyuk
AU - Kang, Jaewoo
N1 - Funding Information:
This research was supported by the National Research Foundation of Korea (NRF-2020R1A2C3010638, NRF-2016M3A9A7916996) and Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HR20C0021).
Publisher Copyright:
© 2020 Association for Computational Linguistics
PY - 2020
Y1 - 2020
N2 - Exposing diverse subword segmentations to neural machine translation (NMT) models often improves the robustness of machine translation, as NMT models can experience various subword candidates. However, the diversification of subword segmentations relies mostly on pre-trained subword language models, from which erroneous segmentations of unseen words are less likely to be sampled. In this paper, we present adversarial subword regularization (ADVSR) to study whether gradient signals during training can serve as a substitute criterion for exposing diverse subword segmentations. We experimentally show that our model-based adversarial samples effectively encourage NMT models to be less sensitive to segmentation errors and improve the performance of NMT models on low-resource and out-of-domain datasets.
AB - Exposing diverse subword segmentations to neural machine translation (NMT) models often improves the robustness of machine translation, as NMT models can experience various subword candidates. However, the diversification of subword segmentations relies mostly on pre-trained subword language models, from which erroneous segmentations of unseen words are less likely to be sampled. In this paper, we present adversarial subword regularization (ADVSR) to study whether gradient signals during training can serve as a substitute criterion for exposing diverse subword segmentations. We experimentally show that our model-based adversarial samples effectively encourage NMT models to be less sensitive to segmentation errors and improve the performance of NMT models on low-resource and out-of-domain datasets.
UR - http://www.scopus.com/inward/record.url?scp=85118466211&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85118466211
T3 - Findings of the Association for Computational Linguistics: EMNLP 2020
SP - 1945
EP - 1953
BT - Findings of the Association for Computational Linguistics: EMNLP 2020
PB - Association for Computational Linguistics (ACL)
T2 - Findings of the Association for Computational Linguistics: EMNLP 2020
Y2 - 16 November 2020 through 20 November 2020
ER -