TY - GEN
T1 - Consistency Training with Virtual Adversarial Discrete Perturbation
AU - Park, Jungsoo
AU - Kim, Gyuwan
AU - Kang, Jaewoo
N1 - Funding Information:
We thank Jinhyuk Lee, Jaewook Kang, and Sung-dong Kim for the discussion and feedback on the paper. We also thank the members of the Conversation team in Naver CLOVA for active discussion. This research was supported by National Research Foundation of Korea (NRF-2020R1A2C3010638) and the Ministry of Science and ICT, Korea, under the ICT Creative Consilience program (IITP-2022-2020-0-01819).
Publisher Copyright:
© 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
N2 - Consistency training regularizes a model by enforcing that predictions for original and perturbed inputs be similar. Previous studies have proposed various augmentation methods for the perturbation, but these are limited in that they are agnostic to the model being trained. Thus, the perturbed samples may not aid regularization because the model can classify them easily. In this context, we propose an augmentation method that adds a discrete noise incurring the highest divergence between predictions. This virtual adversarial discrete noise, obtained by replacing a small portion of tokens while preserving the original semantics as much as possible, efficiently pushes the training model's decision boundary. Experimental results show that our proposed method outperforms other consistency training baselines that use text editing, paraphrasing, or continuous noise, on semi-supervised text classification tasks and a robustness benchmark.
AB - Consistency training regularizes a model by enforcing that predictions for original and perturbed inputs be similar. Previous studies have proposed various augmentation methods for the perturbation, but these are limited in that they are agnostic to the model being trained. Thus, the perturbed samples may not aid regularization because the model can classify them easily. In this context, we propose an augmentation method that adds a discrete noise incurring the highest divergence between predictions. This virtual adversarial discrete noise, obtained by replacing a small portion of tokens while preserving the original semantics as much as possible, efficiently pushes the training model's decision boundary. Experimental results show that our proposed method outperforms other consistency training baselines that use text editing, paraphrasing, or continuous noise, on semi-supervised text classification tasks and a robustness benchmark.
UR - http://www.scopus.com/inward/record.url?scp=85138359178&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85138359178
T3 - NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference
SP - 5646
EP - 5656
BT - NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
T2 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022
Y2 - 10 July 2022 through 15 July 2022
ER -