Visualizing the Embedding Space to Explain the Effect of Knowledge Distillation

Hyun Seung Lee, Christian Wallraven

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Recent research has found that knowledge distillation can be effective in reducing the size of a network and in increasing generalization. A pre-trained, large teacher network, for example, was shown to be able to bootstrap a student model that eventually outperforms the teacher in a limited-label environment. Despite these advances, it remains relatively unclear why this method works, that is, what the resulting student model does ‘better’. To address this issue, we use two non-linear, low-dimensional embedding methods (t-SNE and IVIS) to visualize the representation spaces of different layers in a network. We perform an extensive set of experiments with different architecture parameters and distillation methods. The resulting visualizations and metrics clearly show that, compared to its non-distilled version, a distilled network is guided to find a more compact representation space, associated with higher accuracy, already in earlier layers.
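
The abstract does not state which distillation objectives were compared. As a point of reference only, the minimal sketch below shows one widely used formulation, the temperature-scaled soft-target loss, in plain Python. The function names, the `temperature` and `alpha` defaults, and the example logits are all illustrative assumptions and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a hard-label and a soft-target term.

    Assumed formulation (not confirmed by the source):
    alpha * CE(student, label) + (1 - alpha) * T^2 * KL(teacher || student).
    """
    # Hard-label term: cross-entropy of the student at temperature 1.
    p_student = softmax(student_logits)
    hard = -math.log(p_student[label])

    # Soft-target term: KL divergence between the softened distributions.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(q_teacher, q_student) if t > 0)

    # The T^2 factor keeps the soft-term scale comparable across temperatures.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft


if __name__ == "__main__":
    student = [2.0, 0.5, -1.0]   # hypothetical student logits
    teacher = [3.0, 1.0, -2.0]   # hypothetical teacher logits
    print(distillation_loss(student, teacher, label=0))
```

When the student reproduces the teacher's logits exactly, the soft-target term vanishes and only the weighted hard-label term remains, which is a convenient sanity check for any implementation of this kind.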

Original language: English
Title of host publication: Pattern Recognition - 6th Asian Conference, ACPR 2021, Revised Selected Papers
Editors: Christian Wallraven, Qingshan Liu, Hajime Nagahara
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 14
ISBN (Print): 9783031024436
Publication status: Published - 2022
Event: 6th Asian Conference on Pattern Recognition, ACPR 2021 - Virtual, Online
Duration: 2021 Nov 9 to 2021 Nov 12

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13189 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 6th Asian Conference on Pattern Recognition, ACPR 2021
City: Virtual, Online

Bibliographical note

Funding Information:
Acknowledgments. This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No. 2019-0-00079), Department of Artificial Intelligence, Korea University.

Publisher Copyright:
© 2022, Springer Nature Switzerland AG.

Keywords


  • Computer vision
  • Knowledge distillation
  • Limited data learning
  • Transfer learning
  • Visualization

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

