pabilities combined with brushing and linking allow
the detailed exploration of clusters to understand how
and why they were formed. A fine seriation functionality
enables users to gain an understanding of cluster
substructures. Finally, a sampling mechanism enables
the fluid exploration of large data sets at multiple levels
of detail.
Our design was guided by four usage scenarios,
which we also used to demonstrate and assess the
prototype. All four scenarios were carried out successfully
with real-world datasets from the cyber-security
domain, although they did reveal some weaknesses
in our approach.
Although our prototype was developed for the TRIAGE
approach to cluster analysis, it is generic enough to be
used with similar clustering pipelines.
ACKNOWLEDGEMENTS
We would like to thank Olivier Thonnard for his
valuable advice and feedback during the creation of
this paper. The research leading to these results has
received funding from the European Commission’s
Seventh Framework Programme (FP7/2007-2013) under
grant agreement no. 257495 (VIS-SENSE).
REFERENCES
Abello, J. and van Ham, F. (2004). Matrix zoom: A visual
interface to semi-external graphs. In IEEE Symposium
on Information Visualization, pages 183–190.
Behrisch, M., Davey, J., Fischer, F., Thonnard, O., Schreck,
T., Keim, D., and Kohlhammer, J. (2014). Visual
analysis of sets of heterogeneous matrices using
projection-based distance functions and semantic
zoom. Computer Graphics Forum, 33(3):411–420.
Beliakov, G., Pradera, A., and Calvo, T. (2007).
Aggregation Functions: A Guide for Practitioners, volume
221. Springer, Berlin, Heidelberg.
Bertin, J. and Berg, W. J. (2010). Semiology of Graphics:
Diagrams, Networks, Maps. ESRI Press, Redlands,
Calif., 1st edition.
Bremm, S., Schreck, T., Boba, P., Held, S., and Hamacher,
K. (2010). Computing and visually analyzing mutual
information in molecular co-evolution. BMC
Bioinformatics, 11(1):330.
Choquet, G. (1954). Theory of capacities. Annales de
l’institut Fourier, 5:131–295.
Ellis, G. and Dix, A. (2002). Density control through
random sampling: an architectural perspective. In Sixth
International Conference on Information Visualisation,
pages 82–90.
Ellis, G. and Dix, A. (2007). A taxonomy of clutter
reduction for information visualisation. IEEE Transactions
on Visualization and Computer Graphics,
13(6):1216–1223.
Everitt, B. S., Landau, S., Leese, M., and Stahl, D. (2011).
Cluster Analysis. John Wiley & Sons.
Fischer, F., Davey, J., Fuchs, J., Thonnard, O.,
Kohlhammer, J., and Keim, D. A. (2014). A visual analytics
field experiment to evaluate alternative visualizations
for cyber security applications. In EuroVis Workshop
on Visual Analytics, pages 43–47, Swansea, UK.
Eurographics Association.
Ghoniem, M., Fekete, J.-D., and Castagliola, P. (2005).
On the readability of graphs using node-link and
matrix-based representations: a controlled experiment
and statistical analysis. Information Visualization,
4(2):114–135.
Gower, J. C. (1971). A general coefficient of similarity and
some of its properties. Biometrics, 27(4):857.
Henry, N. and Fekete, J.-D. (2006). MatrixExplorer: a
dual-representation system to explore social networks.
IEEE Transactions on Visualization and Computer
Graphics, 12(5):677–684.
Henry, N., Fekete, J.-D., and McGuffin, M. J. (2007).
NodeTrix: a hybrid visualization of social networks. IEEE
Transactions on Visualization and Computer Graphics,
13(6):1302–1309.
Holten, D. and van Wijk, J. J. (2010). Evaluation of cluster
identification performance for different PCP variants.
Computer Graphics Forum, 29(3):793–802.
Isacenkova, J., Thonnard, O., Costin, A., Balzarotti, D., and
Francillon, A. (2013). Inside the SCAM jungle: A
closer look at 419 scam email operations. In 2013
IEEE Security and Privacy Workshops (SPW), pages
143–150.
Kaufman, L. and Rousseeuw, P. J. (2009). Finding Groups
in Data: An Introduction to Cluster Analysis. John
Wiley & Sons.
Keller, R., Eckert, C. M., and Clarkson, P. J. (2006).
Matrices or node-link diagrams: which visual representation
is better for visualising connectivity models?
Information Visualization, 5(1):62–76.
Kriegel, H.-P., Kröger, P., and Zimek, A. (2009). Clustering
high-dimensional data: A survey on subspace clustering,
pattern-based clustering, and correlation clustering.
ACM Trans. Knowl. Discov. Data, 3(1):1:1–1:58.
Leita, C. and Cova, M. (2011). HARMUR: Storing and
analyzing historic data on malicious domains. In Kirda, E.
and Holz, T., editors, Proceedings of the First Workshop
on Building Analysis Datasets and Gathering
Experience Returns for Security (BADGERS), pages
46–53.
Leita, C. and Dacier, M. (2008). SGNET: A worldwide
deployable framework to support the analysis of
malware threat models. In 2008 Seventh European
Dependable Computing Conference EDCC, pages
99–109.