Abstract
Recent research has found that knowledge distillation can be effective in reducing the size of a network and in increasing generalization. A large, pre-trained teacher network, for example, was shown to be able to bootstrap a student model that eventually outperforms the teacher in a limited-label environment. Despite these advances, it is still relatively unclear why this method works, that is, what the resulting student model does ‘better’. To address this issue, we use two non-linear, low-dimensional embedding methods (t-SNE and IVIS) to visualize the representation spaces of different layers in a network. We perform an extensive set of experiments with different architecture parameters and distillation methods. The resulting visualizations and metrics clearly show that distillation guides the network to find a more compact representation space, yielding higher accuracy, already in earlier layers than its non-distilled counterpart does.
Original language | English |
---|---|
Title of host publication | Pattern Recognition - 6th Asian Conference, ACPR 2021, Revised Selected Papers |
Editors | Christian Wallraven, Qingshan Liu, Hajime Nagahara |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 462-475 |
Number of pages | 14 |
ISBN (Print) | 9783031024436 |
Publication status | Published - 2022 |
Event | 6th Asian Conference on Pattern Recognition, ACPR 2021 - Virtual, Online; Duration: 2021 Nov 9 → 2021 Nov 12 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 13189 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 6th Asian Conference on Pattern Recognition, ACPR 2021 |
---|---|
City | Virtual, Online |
Period | 2021 Nov 9 → 2021 Nov 12 |
Bibliographical note
Funding Information: Acknowledgments. This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No. 2019-0-00079), Department of Artificial Intelligence, Korea University.
Publisher Copyright: © 2022, Springer Nature Switzerland AG.
Keywords
- Computer vision
- Knowledge distillation
- Limited data learning
- Transfer learning
- Visualization
ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science