Visualizing the Embedding Space to Explain the Effect of Knowledge Distillation

Hyun Seung Lee, Christian Wallraven

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Recent research has found that knowledge distillation can be effective in reducing the size of a network and in increasing generalization. A pre-trained, large teacher network, for example, was shown to be able to bootstrap a student model that eventually outperforms the teacher in a limited-label environment. Despite these advances, it is still relatively unclear why this method works, that is, what the resulting student model does ‘better’. To address this issue, we use two non-linear, low-dimensional embedding methods (t-SNE and IVIS) to visualize the representation spaces of different layers in a network. We perform an extensive set of experiments with different architecture parameters and distillation methods. The resulting visualizations and metrics clearly show that, compared to its non-distilled version, distillation guides the network to find a more compact representation space that supports higher accuracy already in earlier layers.
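    As a minimal sketch of the kind of distillation the abstract refers to, the soft-target loss of Hinton et al. (2015) blends a temperature-softened teacher-matching term with ordinary cross-entropy; the hyperparameters below (temperature, alpha) are illustrative placeholders, not the settings used in the paper.

        import torch
        import torch.nn.functional as F

        def distillation_loss(student_logits, teacher_logits, labels,
                              temperature=4.0, alpha=0.7):
            # KL divergence between temperature-softened teacher and
            # student distributions; the T**2 factor keeps gradient
            # magnitudes comparable across temperatures.
            soft = F.kl_div(
                F.log_softmax(student_logits / temperature, dim=1),
                F.softmax(teacher_logits / temperature, dim=1),
                reduction="batchmean",
            ) * temperature ** 2
            # Ordinary cross-entropy on the ground-truth labels.
            hard = F.cross_entropy(student_logits, labels)
            return alpha * soft + (1.0 - alpha) * hard

        # Random tensors stand in for real model outputs.
        student = torch.randn(8, 10)
        teacher = torch.randn(8, 10)
        labels = torch.randint(0, 10, (8,))
        print(distillation_loss(student, teacher, labels))

    The layer-wise visualization can be sketched in the same spirit with scikit-learn's t-SNE; `activations` is a random placeholder for features captured at a given layer (e.g., via a forward hook), and IVIS, the second method the paper uses, follows the same fit/plot pattern.

        import numpy as np
        import matplotlib.pyplot as plt
        from sklearn.manifold import TSNE

        rng = np.random.default_rng(0)
        activations = rng.normal(size=(500, 64))  # placeholder layer features
        labels = rng.integers(0, 10, size=500)    # placeholder class ids

        # Project the high-dimensional layer representation to 2-D.
        emb = TSNE(n_components=2, perplexity=30.0, init="pca",
                   random_state=0).fit_transform(activations)

        plt.scatter(emb[:, 0], emb[:, 1], c=labels, s=5, cmap="tab10")
        plt.title("t-SNE of layer activations")
        plt.savefig("tsne_layer.png", dpi=150)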

    Original language: English
    Title of host publication: Pattern Recognition - 6th Asian Conference, ACPR 2021, Revised Selected Papers
    Editors: Christian Wallraven, Qingshan Liu, Hajime Nagahara
    Publisher: Springer Science and Business Media Deutschland GmbH
    Pages: 462-475
    Number of pages: 14
    ISBN (Print): 9783031024436
    DOIs
    Publication status: Published - 2022
    Event: 6th Asian Conference on Pattern Recognition, ACPR 2021 - Virtual, Online
    Duration: 2021 Nov 9 – 2021 Nov 12

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 13189 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 6th Asian Conference on Pattern Recognition, ACPR 2021
    City: Virtual, Online
    Period: 21/11/9 – 21/11/12

    Bibliographical note

    Funding Information:
    Acknowledgments: This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No. 2019-0-00079, Department of Artificial Intelligence, Korea University).

    Publisher Copyright:
    © 2022, Springer Nature Switzerland AG.

    Keywords

    • Computer vision
    • Knowledge distillation
    • Limited data learning
    • Transfer learning
    • Visualization

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • General Computer Science
