Learning to generate word representations using subword information

Yeachan Kim, Kang Min Kim, Ji Min Lee, Sang Keun Lee

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    17 Citations (Scopus)

    Abstract

    Distributed representations of words play a major role in natural language processing by encoding the semantic and syntactic information of words. However, most existing approaches to learning word representations treat words as individual atomic units and are therefore blind to subword information. This in turn makes it difficult to represent out-of-vocabulary (OOV) words. In this paper, we present a character-based word representation approach that addresses these limitations. The proposed model learns to generate word representations from characters, employing a convolutional neural network and a highway network over characters to extract salient features effectively. Unlike previous models that learn word representations from a large corpus, we take a set of pre-trained word embeddings and generalize it to word entries, including OOV words. To demonstrate the efficacy of the proposed model, we evaluate it on an intrinsic task (word similarity) and an extrinsic task (language modeling). Experimental results show that the proposed model significantly outperforms strong baseline models that regard words or their subwords as atomic units. For example, we achieve as much as an 18.5% average improvement in perplexity on morphologically rich languages over strong baselines in the language modeling task.
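The pipeline described in the abstract (character embeddings, a convolution over character windows, then a highway layer that outputs a word vector) can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation. The abstract gives no hyperparameters or training objective, so the dimensions, the single filter width, the random weights, and the `squared_distance` objective against a pre-trained embedding are all assumptions made for illustration.

```python
import math
import random

random.seed(0)

CHARS = "abcdefghijklmnopqrstuvwxyz"
CHAR_DIM = 4      # character embedding size (assumed, illustrative)
WIDTH = 3         # convolution filter width in characters (assumed)
NUM_FILTERS = 6   # number of filters, also the output word-vector size (assumed)


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Randomly initialised parameters; a trained model would learn these.
char_emb = {c: [random.uniform(-0.5, 0.5) for _ in range(CHAR_DIM)] for c in CHARS}
conv_w = rand_matrix(NUM_FILTERS, WIDTH * CHAR_DIM)
conv_b = [0.0] * NUM_FILTERS
gate_w = rand_matrix(NUM_FILTERS, NUM_FILTERS)
gate_b = [-1.0] * NUM_FILTERS   # negative bias initially favours the carry path
trans_w = rand_matrix(NUM_FILTERS, NUM_FILTERS)
trans_b = [0.0] * NUM_FILTERS


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def char_cnn(word):
    """Convolve over character windows, then max-pool over positions."""
    zero = [0.0] * CHAR_DIM
    # Unknown characters map to a zero vector in this sketch.
    vecs = [char_emb.get(c, zero) for c in word.lower()]
    # Pad short words so that at least one full window exists.
    while len(vecs) < WIDTH:
        vecs.append(zero)
    # Each window is the concatenation of WIDTH consecutive character vectors.
    windows = [sum(vecs[i:i + WIDTH], []) for i in range(len(vecs) - WIDTH + 1)]
    return [max(math.tanh(dot(w, win) + b) for win in windows)
            for w, b in zip(conv_w, conv_b)]


def highway(x):
    """y = t * H(x) + (1 - t) * x, with gate t = sigmoid(W_t x + b_t)."""
    out = []
    for i, x_i in enumerate(x):
        t = sigmoid(dot(gate_w[i], x) + gate_b[i])
        h = max(0.0, dot(trans_w[i], x) + trans_b[i])  # ReLU transform
        out.append(t * h + (1.0 - t) * x_i)
    return out


def generate_word_vector(word):
    """Map any character string, including an OOV word, to a fixed-size vector."""
    return highway(char_cnn(word))


def squared_distance(u, v):
    """Assumed objective: distance between generated and pre-trained vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


vec = generate_word_vector("unseenword")
print(len(vec))
```

Because the vector is computed from characters alone, any string yields a representation, which is what allows OOV words to be handled. In a trained version of this sketch, the parameters would be fitted so that generated vectors approximate the pre-trained embeddings of in-vocabulary words.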

    Original language: English
    Title of host publication: COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings
    Editors: Emily M. Bender, Leon Derczynski, Pierre Isabelle
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 2551-2561
    Number of pages: 11
    ISBN (Electronic): 9781948087506
    Publication status: Published - 2018
    Event: 27th International Conference on Computational Linguistics, COLING 2018 - Santa Fe, United States
    Duration: 2018 Aug 20 - 2018 Aug 26

    Publication series

    Name: COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings

    Conference

    Conference: 27th International Conference on Computational Linguistics, COLING 2018
    Country/Territory: United States
    City: Santa Fe
    Period: 18/8/20 - 18/8/26

    Bibliographical note

    Funding Information:
    This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT (number 2015R1A2A1A10052665).

    Publisher Copyright:
    © 2018 COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings. All rights reserved.

    ASJC Scopus subject areas

    • Language and Linguistics
    • Computational Theory and Mathematics
    • Linguistics and Language
