TY - GEN
T1 - Heterogeneous data fusion via space alignment using nonmetric multidimensional scaling
AU - Choo, Jaegul
AU - Bohn, Shawn
AU - Nakamura, Grant C.
AU - White, Amanda M.
AU - Park, Haesun
PY - 2012
Y1 - 2012
N2 - Heterogeneous data sets are typically represented in different feature spaces, making it difficult to analyze relationships spanning different data sets even when they are semantically related. Data fusion via space alignment can address this problem by integrating multiple data sets lying in different spaces into one common space. Given a set of reference correspondence data that share the same semantic meaning across different spaces, space alignment attempts to place the corresponding reference data as close together as possible, and accordingly, the entire data sets are aligned in a common space. Space alignment involves optimizing two potentially conflicting criteria: minimum deformation of the original relationships and maximum alignment between the different spaces. To solve this problem, we provide a novel graph embedding framework for space alignment, which converts each data set into a graph and assigns zero distance between reference correspondence pairs, resulting in a single graph. We propose a graph embedding method for fusion based on nonmetric multidimensional scaling (MDS). Because its criterion uses rank order rather than distance, nonmetric MDS can effectively handle both deformation and alignment. Experiments using parallel data sets demonstrate that our approach works well in comparison to existing methods such as constrained Laplacian eigenmaps, Procrustes analysis, and tensor decomposition. We also present standard cross-domain information retrieval tests as well as visualization examples using space alignment.
UR - http://www.scopus.com/inward/record.url?scp=84875848281&partnerID=8YFLogxK
U2 - 10.1137/1.9781611972825.16
DO - 10.1137/1.9781611972825.16
M3 - Conference contribution
AN - SCOPUS:84875848281
SN - 9781611972320
T3 - Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012
SP - 177
EP - 188
BT - Proceedings of the 12th SIAM International Conference on Data Mining, SDM 2012
PB - Society for Industrial and Applied Mathematics Publications
T2 - 12th SIAM International Conference on Data Mining, SDM 2012
Y2 - 26 April 2012 through 28 April 2012
ER -