Distillation from Heterogeneous Models for Top-K Recommendation

  • SeongKu Kang
  • Wonbin Kweon
  • Dongha Lee
  • Jianxun Lian
  • Xing Xie
  • Hwanjo Yu*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Recent recommender systems have shown remarkable performance by using an ensemble of heterogeneous models. However, this approach is exceedingly costly: its resource usage and inference latency grow in proportion to the number of models, which remains a bottleneck for production. Our work aims to transfer the ensemble knowledge of heterogeneous teachers to a lightweight student model via knowledge distillation (KD), reducing the huge inference cost while retaining high accuracy. Through an empirical study, we find that the efficacy of distillation drops severely when knowledge is transferred from heterogeneous teachers. Nevertheless, we show that an important signal for easing this difficulty can be obtained from the teachers' training trajectories. This paper proposes a new KD framework, named HetComp, that guides the student model by transferring easy-to-hard sequences of knowledge generated from the teachers' trajectories. To provide guidance matched to the student's learning state, HetComp uses dynamic knowledge construction to supply progressively more difficult ranking knowledge, and adaptive knowledge transfer to gradually convey finer-grained ranking information. Our comprehensive experiments show that HetComp significantly improves both the distillation quality and the generalization of the student model.
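The easy-to-hard guidance described in the abstract can be sketched in simplified form. This is a hypothetical illustration, not the authors' implementation: teacher checkpoints from the training trajectory serve as progressively harder ranking knowledge, a listwise (Plackett-Luce) loss distills the teacher's top-K order into the student, and the knowledge source advances to the next checkpoint once the student has absorbed the current one. All function names, the loss choice, and the advancement rule are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, K = 50, 5

# Hypothetical teacher trajectory for one user: three score checkpoints,
# from an early (easier) to the final (hardest) state of teacher training.
trajectory = [rng.normal(size=n_items) for _ in range(3)]

def topk(scores, k=K):
    """Indices of the k highest-scored items, best first."""
    return np.argsort(-scores)[:k]

def listwise_kd_loss(student_scores, teacher_ranking):
    """Plackett-Luce negative log-likelihood of the teacher's top-K order
    under the student's scores (one common listwise distillation loss)."""
    s = student_scores[teacher_ranking]
    loss = 0.0
    for r in range(len(s)):
        rest = s[r:]              # items the teacher ranked at position r or lower
        m = rest.max()            # stabilized log-sum-exp
        loss += m + np.log(np.exp(rest - m).sum()) - s[r]
    return loss

def maybe_advance(stage, student_scores, trajectory, tol=0.5):
    """Dynamic knowledge construction (sketch): once the student's loss on the
    current checkpoint's ranking falls below `tol`, move to the next, harder one."""
    teacher_ranking = topk(trajectory[stage])
    done = listwise_kd_loss(student_scores, teacher_ranking) < tol
    if done and stage + 1 < len(trajectory):
        return stage + 1
    return stage
```

In this sketch, the student would minimize `listwise_kd_loss` against the current checkpoint's top-K ranking during training, while `maybe_advance` periodically checks whether the knowledge source should move along the trajectory; the paper's adaptive knowledge transfer additionally varies the granularity of the transferred ranking, which is omitted here.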

Original language: English
Title of host publication: ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023
Publisher: Association for Computing Machinery, Inc
Pages: 801-811
Number of pages: 11
ISBN (Electronic): 9781450394161
DOIs
Publication status: Published - 2023 Apr 30
Externally published: Yes
Event: 32nd ACM World Wide Web Conference, WWW 2023 - Austin, United States
Duration: 2023 Apr 30 - 2023 May 4

Publication series

Name: ACM Web Conference 2023 - Proceedings of the World Wide Web Conference, WWW 2023

Conference

Conference: 32nd ACM World Wide Web Conference, WWW 2023
Country/Territory: United States
City: Austin
Period: 23/4/30 - 23/5/4

Bibliographical note

Publisher Copyright:
© 2023 ACM.

Keywords

  • Easy-to-hard learning
  • Knowledge distillation
  • Model compression
  • Recommender system

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Software
