attribute mapping. The approach is exemplified using
mid-sized open-source software development projects.
An application to the data sets indicates both an
effective layout based on the semantic similarity of
source code and an explorative visualization approach
for the resulting maps.
Future work can be discussed separately for layout
generation and the visualization approach. For layout
improvements, software similarity (Al-msie’deen
et al., 2013) and suitable dimensionality reduction
methods (Cha, 2007; Vernier et al., 2020) are targets
for further research. The visualization approach needs
to be evaluated for readability in user studies.
Regarding the variability of the 3D models, tooling
for the glTF models would simplify the integration
of other 3D models, even for applications outside of
forest metaphors. As the input for the prototypical
viewer consists of CSV files with explicit layout and
additional attributes as columns, we already envision
easy visualization prototyping and adaptation in other
visualization domains as well. From a broader
perspective, the proposed visualization approach allows
for the visualization of structured data sets with a
scatter-plot layout and dynamic model mapping, which
supports a widespread use of thematic mapping: the
topic maps.
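To illustrate the CSV-based input mentioned above, the following is a minimal
sketch of how rows with an explicit layout and additional attribute columns
could be mapped to 3D model instances. The column names (name, x, y, loc,
complexity), the model names, and the scaling range are assumptions made for
illustration only; they are not taken from the prototypical viewer.

```python
import csv
import io

# Hypothetical input: explicit 2D layout (x, y) plus attribute columns.
# The schema and values are illustrative, not the viewer's actual format.
SAMPLE_CSV = """name,x,y,loc,complexity
parser.c,0.12,0.80,1500,42
lexer.c,0.15,0.78,300,7
README.md,0.90,0.10,40,1
"""


def load_points(text):
    """Parse CSV rows into dicts with numeric layout and attribute values."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "name": row["name"],
            "x": float(row["x"]),
            "y": float(row["y"]),
            "loc": int(row["loc"]),
            "complexity": int(row["complexity"]),
        })
    return rows


def normalize(value, lo, hi):
    """Linearly scale value into [0, 1]; a constant column maps to 0."""
    return 0.0 if hi == lo else (value - lo) / (hi - lo)


def map_to_glyphs(rows, size_attr="loc", model_attr="complexity",
                  models=("sapling", "tree", "old_tree")):
    """Map one attribute to glyph scale and another to a model variant."""
    size_lo = min(r[size_attr] for r in rows)
    size_hi = max(r[size_attr] for r in rows)
    model_lo = min(r[model_attr] for r in rows)
    model_hi = max(r[model_attr] for r in rows)
    glyphs = []
    for r in rows:
        size_t = normalize(r[size_attr], size_lo, size_hi)
        model_t = normalize(r[model_attr], model_lo, model_hi)
        # Bin the normalized value into one of the available model variants.
        index = min(int(model_t * len(models)), len(models) - 1)
        glyphs.append({
            "name": r["name"],
            "position": (r["x"], r["y"]),
            "scale": 0.5 + 1.5 * size_t,  # assumed scale range [0.5, 2.0]
            "model": models[index],
        })
    return glyphs


glyphs = map_to_glyphs(load_points(SAMPLE_CSV))
for g in glyphs:
    print(g["name"], g["position"], round(g["scale"], 2), g["model"])
```

Because the layout is read directly from the columns, replacing the model
names or choosing different attribute columns would be enough to prototype a
mapping for another domain.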
ACKNOWLEDGEMENTS
We want to thank the anonymous reviewers for their
valuable comments and suggestions to improve this ar-
ticle. Further, this work is part of the “Software-DNA”
project, which is funded by the European Regional De-
velopment Fund (ERDF – or EFRE in German) and the
State of Brandenburg (ILB) as well as the “TASAM”
project, which is funded by the German Federal Min-
istry for Economic Affairs and Energy (BMWi, ZIM).
REFERENCES
Al-msie’deen, R., Seriai, A.-D., Huchard, M., Urtado, C.,
and Vauttier, S. (2013). Mining features from the
object-oriented source code of software variants by
combining lexical and structural similarity. In Proc.
14th International Conference on Information Reuse &
Integration, IRI ’13, pages 586–593. IEEE.
Auber, D., Huet, C., Lambert, A., Renoust, B., Sallaberry,
A., and Saulnier, A. (2013). GosperMap: Using a
Gosper curve for laying out hierarchical data. IEEE
Transactions on Visualization and Computer Graphics,
19(11):1820–1832.
Balogh, G., Szabolics, A., and Beszédes, Á. (2015).
CodeMetropolis: Eclipse over the city of source code.
In Proc. 15th International Working Conference on
Source Code Analysis and Manipulation, SCAM ’15,
pages 271–276. IEEE.
Barth, L., Fabrikant, S. I., Kobourov, S. G., Lubiw, A.,
Nöllenburg, M., Okamoto, Y., Pupyrev, S., Squarcella,
C., Ueckerdt, T., and Wolff, A. (2014). Semantic word
cloud representations: Hardness and approximation
algorithms. In Latin American Symposium on Theoret-
ical Informatics, pages 514–525. Springer.
Beck, F. (2014). Software feathers - figurative visualiza-
tion of software metrics. In Proc. 5th International
Conference on Information Visualization Theory and
Applications - Volume 1: IVAPP, IVAPP ’14, pages
5–16. INSTICC, SciTePress.
Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent
Dirichlet allocation. Journal of Machine Learning
Research, 3:993–1022.
Bruneton, E. and Neyret, F. (2012). Real-time realistic ren-
dering and lighting of forests. Computer Graphics
Forum, 31(2pt1):373–382.
Cha, S.-H. (2007). Comprehensive survey on distance/simi-
larity measures between probability density functions.
International Journal of Mathematical Models and
Methods in Applied Sciences, 1(4):300–307.
Cornelissen, B., Zaidman, A., Holten, D., Moonen, L.,
Deursen, A., and van Wijk, J. (2008). Execution trace
analysis through massive sequence and circular bundle
views. Journal of Systems and Software, 81:2252–
2268.
Cox, M. A. and Cox, T. F. (2008). Multidimensional scaling.
In Handbook of Data Visualization, pages 315–347.
Springer.
Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer,
T. K., and Harshman, R. (1990). Indexing by latent
semantic analysis. Journal of the American Society for
Information Science, 41(6):391–407.
Dübel, S., Röhlig, M., Schumann, H., and Trapp, M. (2014).
2D and 3D presentation of spatial data: A systematic
review. In Proc. VIS International Workshop on 3DVis,
3DVis ’14, pages 11–18. IEEE.
Hawes, N., Marshall, S., and Anslow, C. (2015).
CodeSurveyor: Mapping large-scale software to aid in code
comprehension. In Proc. 3rd Working Conference on
Software Visualization, VISSOFT ’15, pages 96–105.
IEEE.
Holten, D., Vliegen, R., and van Wijk, J. (2005). Visual real-
ism for the visualization of software metrics. In Proc.
3rd International Workshop on Visualizing Software
for Understanding and Analysis, VISSOFT ’05, pages
1–6. IEEE.
Kleiberg, E., van de Wetering, H., and van Wijk, J. J. (2001).
Botanical visualization of huge hierarchies. In Proc.
IEEE Symposium on Information Visualization, pages 87–87.
Kohonen, T. (1997). Exploration of very large databases
by self-organizing maps. In Proc. International Con-
ference on Neural Networks, ICNN ’97, pages 1–6.
IEEE.
Kuhn, A., Loretan, P., and Nierstrasz, O. (2008). Consistent
layout for thematic software maps. In Proc. 15th Work-
ing Conference on Reverse Engineering, WCRE ’08,
pages 209–218. IEEE.
Software Forest: A Visualization of Semantic Similarities in Source Code using a Tree Metaphor