Delving into Instance-Dependent Label Noise in Graph Data: A Comprehensive Study and Benchmark

  • Suyeon Kim
  • , Seong Ku Kang*
  • , Dongwoo Kim
  • , Jungseul Ok
  • , Hwanjo Yu
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Graph Neural Networks (GNNs) have achieved state-of-the-art performance in node classification tasks but struggle with label noise in real-world data. Existing studies on graph learning with label noise commonly rely on class-dependent label noise, overlooking the complexities of instance-dependent noise and falling short of capturing real-world corruption patterns. We introduce BeGIN (Benchmarking for Graphs with Instance-dependent Noise), a new benchmark that provides realistic graph datasets with various noise types and comprehensively evaluates noise-handling strategies across GNN architectures, noisy label detection, and noise-robust learning. To simulate instance-dependent corruptions, BeGIN introduces algorithmic methods and LLM-based simulations. Our experiments reveal the challenges of instance-dependent noise, particularly LLM-based corruption, and underscore the importance of node-specific parameterization to enhance GNN robustness. By comprehensively evaluating noise-handling strategies, BeGIN provides insights into their effectiveness, efficiency, and key performance factors. We expect that BeGIN will serve as a valuable resource for advancing research on label noise in graphs and fostering the development of robust GNN training methods. The code is available at https://github.com/kimsu55/BeGIN.

Original languageEnglish
Title of host publicationKDD 2025 - Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
PublisherAssociation for Computing Machinery
Pages5539-5550
Number of pages12
ISBN (Electronic)9798400714542
DOIs
Publication statusPublished - 2025 Aug 3
Event31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025 - Toronto, Canada
Duration: 2025 Aug 32025 Aug 7

Publication series

NameProceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume2
ISSN (Print)2154-817X

Conference

Conference31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025
Country/TerritoryCanada
CityToronto
Period25/8/325/8/7

Bibliographical note

Publisher Copyright:
© 2025 Copyright held by the owner/author(s).

Keywords

  • graph neural networks
  • label noise
  • node classification

ASJC Scopus subject areas

  • Software
  • Information Systems

Fingerprint

Dive into the research topics of 'Delving into Instance-Dependent Label Noise in Graph Data: A Comprehensive Study and Benchmark'. Together they form a unique fingerprint.

Cite this