Are we truly forgetting? A critical re-examination of machine unlearning evaluation protocols

  • Yongwoo Kim
  • Sungmin Cha
  • Donghyun Kim*

*Corresponding author for this work

Research output: Contribution to journal; Article; peer-reviewed

Abstract

Machine unlearning is the process of removing specific data points from a trained model while maintaining performance on the retain data, in order to meet privacy or legal requirements. Despite its importance, existing unlearning evaluations tend to rely on logit-based metrics in small-scale scenarios. We observe that this can create a false sense of security about unlearning approaches in real-world scenarios. In this paper, we conduct a comprehensive evaluation that applies representation-based metrics to unlearned models in large-scale scenarios, to verify whether unlearning approaches truly eliminate the targeted data from the perspective of the model’s representations. Our analysis reveals that current state-of-the-art unlearning approaches either completely degrade the representational quality of the unlearned model or merely modify the classifier, achieving superior logit-based performance while remaining representationally similar to the original model. Furthermore, we introduce a novel unlearning evaluation scenario in which the forgetting classes are semantically similar to downstream task classes. This scenario requires feature representations to diverge significantly from those of the original model, and so enables a more thorough evaluation from a representation perspective. We hope our benchmark will serve as a standardized protocol for evaluating unlearning algorithms under realistic conditions.
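
One way to probe the representation-level claim in the abstract is to compare the features that the original and unlearned models produce on the same inputs. The sketch below uses linear centered kernel alignment (CKA), a widely used similarity measure. The abstract does not state which representation metric the benchmark uses, so the choice of CKA, the function name `linear_cka`, and the synthetic feature arrays are all assumptions made for illustration, not the authors' protocol.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two feature matrices of shape (n_samples, dim).

    Returns a value in [0, 1]; 1 means the representations match up to
    rotation and isotropic scaling. (Hypothetical helper, not from the paper.)
    """
    # Center each feature dimension across the samples.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # Squared Frobenius norm of the cross-covariance, normalized by the
    # Frobenius norms of the two self-covariances.
    cross = np.linalg.norm(x.T @ y, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Synthetic stand-ins for encoder features (assumption: real use would
# extract these from the original and unlearned models on a shared batch).
rng = np.random.default_rng(0)
feats_original = rng.normal(size=(200, 16))

# Case 1: features that are only rotated and rescaled, mimicking an
# unlearned model whose encoder is essentially unchanged.
q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
feats_rotated = 3.0 * feats_original @ q

# Case 2: unrelated features, mimicking a representation that has diverged.
feats_random = rng.normal(size=(200, 16))

cka_rotated = linear_cka(feats_original, feats_rotated)
cka_random = linear_cka(feats_original, feats_random)
print(f"CKA (rotated copy): {cka_rotated:.3f}")
print(f"CKA (unrelated):    {cka_random:.3f}")
```

Under this reading, a high CKA between the original and unlearned encoders on forget-set inputs would suggest that an unlearning method changed mainly the classifier, even if its logit-based scores look strong.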

Original language: English
Article number: 113785
Journal: Engineering Applications of Artificial Intelligence
Volume: 167
Publication status: Published - 2026 Mar 1

Bibliographical note

Publisher Copyright:
© 2026 Elsevier Ltd.

Keywords

  • Data privacy
  • Machine learning
  • Machine unlearning
  • Representation learning
  • Transfer learning
  • Unlearning evaluation benchmark

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Artificial Intelligence
