Authors:
Vladimir Molchanov
1
and
Lars Linsen
2
Affiliations:
1
Jacobs University, Germany
;
2
Jacobs University and Westfälische Wilhelms-Universität Münster, Germany
Keyword(s):
Multi-dimensional Data Visualization, Multi-field Data, Clustering.
Related
Ontology
Subjects/Areas/Topics:
Abstract Data Visualization
;
Computer Vision, Visualization and Computer Graphics
;
High-Dimensional Data and Dimensionality Reduction
;
Multi-Field Visualization
;
Spatial Data Visualization
Abstract:
Visual analytics of multidimensional data suffer from the curse of dimensionality, i.e., that even large numbers of data points will be scattered in a high-dimensional space. The curse of dimensionality prohibits the proper use of clustering algorithms in the high-dimensional space. Projecting the space before clustering imposes a loss of information and possible mixing of separated clusters. We present an approach where we overcome the curse of dimensionality for a particular type of multidimensional data, namely for attribute spaces of multivariate volume data. For multivariate volume data, it is possible to interpolate between the data points in the high-dimensional attribute space based on their spatial relationship in the volumetric domain (or physical space). We apply this idea to a histogram-based clustering algorithm. We create a uniform partition of the attribute space in multidimensional bins and compute a histogram indicating the number of data samples belonging to each bi
n. Only non-empty bins are stored for efficiency. Without interpolation, the analysis is highly sensitive to the cell sizes yielding inaccurate clustering for improper choices: Large cells result in no cluster separation, while clusters fall apart for small cells. Using tri-linear interpolation in physical space, we can refine the data by generating additional samples. The refinement scheme can adapt to the data point distribution in attribute space and the histogram’s bin size. As a consequence, we can generate a density computation, where clusters stay connected even when using very small cell sizes. We exploit this result to create a robust hierarchical cluster tree. It can be visually explored using coordinated views to physical space visualizations and to parallel coordinates plots. We apply our technique to several datasets and compare the results against results without interpolation.
(More)