Corruption-based anomaly detection and interpretation in tabular data

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Recent advances in self-supervised learning (SSL) have proven crucial in effectively learning representations of unstructured data, encompassing text, images, and audio. Although the applications of these advances in anomaly detection have been explored extensively, applying SSL to tabular data presents challenges because of the absence of prior information on data structure. In response, we propose a framework for anomaly detection in tabular datasets using variable corruption. Through selective variable corruption and assignment of new labels based on the degree of corruption, our framework can effectively distinguish between normal and abnormal data. Furthermore, analyzing the impact of corruption on anomaly scores aids in the identification of important variables. Experimental results obtained from various tabular datasets validate the precision and applicability of the proposed method. The source code can be accessed at https://github.com/mokch/CAIT.
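The corruption-and-relabeling step described in the abstract can be illustrated with a minimal sketch. This is not taken from the CAIT repository: the function name `corrupt_rows`, the `max_corrupt` parameter, the choice to corrupt a cell by resampling from its column's empirical marginal, and the use of the corrupted fraction as the label are all assumptions made for illustration. The paper's actual corruption scheme and labeling may differ.

```python
import numpy as np


def corrupt_rows(X, max_corrupt, rng):
    """Hypothetical helper: corrupt a random subset of variables in each row
    and label the row by its degree of corruption (fraction of variables)."""
    n_rows, n_cols = X.shape
    # Number of variables to corrupt per row, from 0 (untouched) to max_corrupt.
    k = rng.integers(0, max_corrupt + 1, size=n_rows)
    # Give each row a random ranking of its columns; the k[i] lowest-ranked
    # columns of row i are selected, so exactly k[i] cells are marked.
    ranks = rng.random((n_rows, n_cols)).argsort(axis=1).argsort(axis=1)
    mask = ranks < k[:, None]
    # Assumed corruption: replace a marked cell with the same column's value
    # from a randomly chosen row (a draw from the empirical marginal).
    # A donor value may coincide with the original value by chance.
    donor_rows = rng.integers(0, n_rows, size=(n_rows, n_cols))
    donors = X[donor_rows, np.arange(n_cols)]
    X_corrupted = np.where(mask, donors, X)
    # Assumed label: fraction of variables that were selected for corruption.
    labels = k / n_cols
    return X_corrupted, labels, mask


# Usage on synthetic data (purely illustrative, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X_corrupted, labels, mask = corrupt_rows(X, max_corrupt=3, rng=rng)
```

Under these assumptions, a model trained to predict `labels` from `X_corrupted` could be applied to unseen, uncorrupted rows, with a higher predicted degree of corruption read as a higher anomaly score. That reading is an inference from the abstract rather than a description of the published implementation.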

Original language: English
Article number: 111149
Journal: Pattern Recognition
Volume: 159
Publication status: Published - Mar 2025

Bibliographical note

Publisher Copyright:
© 2024

Keywords

  • Anomaly detection
  • Explainable artificial intelligence
  • Self-supervised learning
  • Tabular data
  • Variable corruption

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
