Going metric: Denoising pairwise data

Volker Roth, Julian Laub, Joachim M. Buhmann, Klaus-Robert Müller

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

30 Citations (Scopus)

Abstract

Pairwise data in empirical sciences typically violate metricity, either due to noise or due to fallible estimates, and are therefore hard to analyze with conventional machine learning technology. In this paper we study ways to work around this problem. First, we present an alternative embedding to multi-dimensional scaling (MDS) that allows us to apply a variety of classical machine learning and signal processing algorithms. The class of pairwise grouping algorithms that share the shift-invariance property is statistically invariant under this embedding procedure, leading to identical assignments of objects to clusters. Based on this new vectorial representation, denoising methods are applied in a second step. Both steps provide a theoretically well-controlled setup to translate from pairwise data to the respective denoised metric representation. We demonstrate the practical usefulness of our theoretical reasoning by discovering structure in protein sequence databases, visibly improving performance upon existing automatic methods.
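The shift-invariance property mentioned in the abstract can be illustrated on a toy example. The sketch below is not taken from the paper: the cost function (a standard pairwise clustering cost, per-cluster sum of dissimilarities divided by cluster size), the hand-picked shift constant, the toy matrix `D`, and all function names are assumptions made for illustration. It only checks the triangle inequality, which is a weaker condition than the vectorial embedding the abstract describes. Under these assumptions, adding a constant to every off-diagonal dissimilarity changes the cost of each k-cluster assignment by the same amount, so the best assignment stays the same while the triangle-inequality violation disappears.

```python
from itertools import product


def pairwise_cost(D, labels, k):
    """Assumed pairwise clustering cost: for each cluster, the sum of all
    within-cluster dissimilarities divided by the cluster size."""
    total = 0.0
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if not members:
            return float("inf")  # require exactly k non-empty clusters
        total += sum(D[i][j] for i in members for j in members) / len(members)
    return total


def shift_off_diagonal(D, c):
    """Add the constant c to every off-diagonal entry; keep the diagonal at zero."""
    n = len(D)
    return [[D[i][j] + (c if i != j else 0.0) for j in range(n)] for i in range(n)]


def violates_triangle(D):
    """True if some triple (i, j, m) breaks D[i][j] <= D[i][m] + D[m][j]."""
    n = len(D)
    return any(
        D[i][j] > D[i][m] + D[m][j] + 1e-12
        for i in range(n) for j in range(n) for m in range(n)
    )


def best_partition(D, k):
    """Brute-force the cost-minimizing assignment (tiny n only) and return it
    as a sorted list of member tuples, so label permutations compare equal."""
    n = len(D)
    labels = min(product(range(k), repeat=n),
                 key=lambda lab: pairwise_cost(D, lab, k))
    return sorted(tuple(i for i in range(n) if labels[i] == c) for c in range(k))


# Hypothetical toy dissimilarities: D[0][2] = 5 > D[0][1] + D[1][2] = 2,
# so the triangle inequality is violated.
D = [
    [0, 1, 5, 9, 9],
    [1, 0, 1, 9, 8],
    [5, 1, 0, 8, 9],
    [9, 9, 8, 0, 1],
    [9, 8, 9, 1, 0],
]
D_shifted = shift_off_diagonal(D, 4.0)  # hand-picked constant, an assumption

print(violates_triangle(D), violates_triangle(D_shifted))
print(best_partition(D, 2) == best_partition(D_shifted, 2))
```

With this cost, a shift by c adds c · (n − k) to every assignment with k non-empty clusters (here 4 · (5 − 2) = 12), which is why the ranking of assignments is unaffected. How the paper itself chooses the shift is not stated in the abstract.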

Original language: English
Title of host publication: Advances in Neural Information Processing Systems 15 - Proceedings of the 2002 Conference, NIPS 2002
Publisher: Neural Information Processing Systems Foundation
ISBN (Print): 0262025507, 9780262025508
Publication status: Published - 2003
Externally published: Yes
Event: 16th Annual Neural Information Processing Systems Conference, NIPS 2002 - Vancouver, BC, Canada
Duration: 2002 Dec 9 to 2002 Dec 14

Publication series

Name: Advances in Neural Information Processing Systems
ISSN (Print): 1049-5258

Other

Other: 16th Annual Neural Information Processing Systems Conference, NIPS 2002
Country/Territory: Canada
City: Vancouver, BC
Period: 02/12/9 to 02/12/14

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
