Efficient Pre-training of Masked Language Model via Concept-based Curriculum Masking

Mingyu Lee, Jun Hyung Park, Junho Kim, Kang Min Kim, Sang Keun Lee

Research output: Contribution to conference › Paper › peer-review

5 Citations (Scopus)

Abstract

Masked language modeling (MLM) has been widely used for pre-training effective bidirectional representations but incurs substantial training costs. In this paper, we propose a novel concept-based curriculum masking (CCM) method to efficiently pre-train a language model. CCM differs from existing curriculum learning approaches in two key ways, designed to reflect the nature of MLM. First, we introduce a carefully designed linguistic difficulty criterion that evaluates the MLM difficulty of each token. Second, we construct a curriculum that gradually masks words related to the previously masked words, retrieving those related words from a knowledge graph. Experimental results show that CCM significantly improves pre-training efficiency. Specifically, the model trained with CCM shows comparable performance to the original BERT on the General Language Understanding Evaluation (GLUE) benchmark at half the training cost. Code is available at https://github.com/KoreaMGLEE/Concept-based-curriculum-masking.
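As an illustration of the curriculum described above, the following is a minimal Python sketch of one way stage-wise concept masking could be organized: the set of maskable concepts grows by following knowledge-graph edges from concepts covered in earlier stages, and only tokens mapped to a covered concept are eligible for masking. The function names, the toy graph, and the masking scheme are simplified assumptions for illustration, not the authors' implementation (see the linked repository for that).

```python
# Hypothetical sketch of concept-based curriculum masking; names and the toy
# graph are illustrative assumptions, not the paper's actual code.
import random


def build_curriculum(concept_graph, seed_concepts, num_stages):
    """Grow the set of maskable concepts stage by stage by following
    knowledge-graph edges from concepts already covered."""
    stages, covered = [], set(seed_concepts)
    for _ in range(num_stages):
        stages.append(set(covered))
        # Expand to neighbors of covered concepts (adjacency-dict assumption).
        covered |= {n for c in covered for n in concept_graph.get(c, ())}
    return stages


def mask_tokens(tokens, token_to_concept, allowed_concepts,
                mask_token="[MASK]", mask_prob=0.15):
    """Mask only tokens whose concept belongs to the current curriculum stage."""
    masked, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if token_to_concept.get(tok) in allowed_concepts and random.random() < mask_prob:
            labels[i] = tok
            masked[i] = mask_token
    return masked, labels


# Toy example: the curriculum starts from "dog" and expands along graph edges.
graph = {"dog": ["animal"], "animal": ["organism"]}
stages = build_curriculum(graph, seed_concepts=["dog"], num_stages=3)
tokens = ["the", "dog", "is", "an", "animal"]
concepts = {"dog": "dog", "animal": "animal"}
# Stage 1 covers {"dog", "animal"}; mask_prob=1.0 makes the output deterministic.
print(mask_tokens(tokens, concepts, stages[1], mask_prob=1.0))
```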

Original language: English
Pages: 7417-7427
Number of pages: 11
Publication status: Published - 2022
Event: 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 - Abu Dhabi, United Arab Emirates
Duration: 2022 Dec 7 – 2022 Dec 11

Conference

Conference: 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 22/12/7 – 22/12/11

Bibliographical note

Publisher Copyright:
© 2022 Association for Computational Linguistics.

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems
