Authors: Desislava Decheva (1) and Lars Linsen (2)
Affiliations: (1) Jacobs University, Germany; (2) Westfälische Wilhelms-Universität Münster, Germany
Keyword(s): Multidimensional Data Visualization, Projection Methods, Visual Clutter Reduction.
Related Ontology Subjects/Areas/Topics: Abstract Data Visualization; Computer Vision, Visualization and Computer Graphics; High-Dimensional Data and Dimensionality Reduction
Abstract:
Visualization of unlabeled multidimensional data is commonly performed using projections to a 2D visual
space, which supports an investigative interactive analysis. However, static views obtained by a projection
method like Principal Component Analysis (PCA) may not capture all data features well. Moreover, in the case
of large data with many samples, the scatterplots suffer from overplotting, which hinders analysis.
Clustering tools allow for aggregation of data to meaningful structures. Clustering methods like K-means,
however, also suffer from drawbacks. We present a novel approach to visually encode aggregated data in
projected views and to interactively explore the data. We make use of the benefits of PCA and K-means
clustering, but overcome their main drawbacks. The sensitivity of K-means to outlier points is ameliorated,
while the sensitivity of PCA to axis scaling is converted into a powerful flexibility, allowing the user to
change observation perspective by rescaling the original axes. Analysis of both clusters and outliers is
facilitated. Properties of clusters are visually encoded in aggregated form using color and size or examined
in detail via local scatterplots or local circular parallel coordinate plots. The granularity of the data
aggregation process can be adjusted interactively. A star coordinate interaction widget allows for modifying
the projection matrix. To convey how much the projection maintains neighborhoods, we use a distance
encoding. We evaluate our tool using synthetic and real-world data sets and perform a user study to evaluate
its effectiveness.
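
The abstract's remark that PCA is sensitive to axis scaling, and that rescaling the original axes changes the observation perspective, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pca_project`, the `scales` parameter, and the synthetic data are assumptions made purely for illustration, and plain NumPy SVD stands in for whatever PCA routine the tool uses.

```python
import numpy as np

def pca_project(X, scales=None, n_components=2):
    """Project X to n_components dimensions with PCA after rescaling each axis.

    `scales` is a hypothetical per-axis weight vector chosen by the user;
    None means every original axis keeps weight 1.
    """
    X = np.asarray(X, dtype=float)
    if scales is None:
        scales = np.ones(X.shape[1])
    # Rescale the original axes, then center the data.
    Xs = X * np.asarray(scales, dtype=float)
    Xc = Xs - Xs.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    return Xc @ components.T, components

# Synthetic data (assumed): axis 0 has the largest spread.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 1.0])

# Unscaled view: the first principal direction follows axis 0.
proj_default, comps_default = pca_project(X)

# Emphasize axis 2 and suppress axis 0: the first direction now follows axis 2.
proj_scaled, comps_scaled = pca_project(X, scales=[0.1, 1.0, 10.0])

dominant_default = int(np.argmax(np.abs(comps_default[0])))
dominant_scaled = int(np.argmax(np.abs(comps_scaled[0])))
```

In this sketch the same samples yield two different 2D views depending only on the per-axis weights, which is the kind of flexibility the abstract attributes to user-driven rescaling.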