
see different directions for future work. Primarily, we plan to extend our benchmark to address our major internal threat to validity by including more datasets and layout algorithms. Besides accuracy and perception, other aspects of quality, e.g., temporal stability, could be quantified and taken into account, too. Furthermore, besides a quantitative study of text layout algorithms, a qualitative approach that categorizes layouts according to their topological and geometrical properties would be interesting. We expect that such a categorization would be beneficial for choosing DRs for specific analytics tasks.
ACKNOWLEDGEMENTS
We thank the anonymous reviewers for their valuable feedback. This work was partially funded by the German Federal Ministry of Education and Research (BMBF) through grant 01IS22062 (“AI research group FFS-AI”). The work of Tobias Schreck was partially funded by the Austrian Research Promotion Agency (FFG) within the framework of the flagship project ICT of the Future PRESENT, grant FO999899544.
REFERENCES
Aggarwal, C. C. and Zhai, C. (2012a). A survey of text
classification algorithms. In Mining Text Data, pages
163–222. Springer.
Aggarwal, C. C. and Zhai, C. (2012b). A survey of text
clustering algorithms. In Mining Text Data, pages 77–
128. Springer.
Aletras, N. and Stevenson, M. (2013). Evaluating topic co-
herence using distributional semantics. In Proc. 10th
International Conference on Computational Seman-
tics, IWCS ’13, pages 13–22. ACL.
Atzberger, D., Cech, T., de la Haye, M., Söchting, M., Scheibel, W., Limberger, D., and Döllner, J. (2021). Software Forest: A visualization of semantic similarities in source code using a tree metaphor. In Proc. 16th International Conference on Information Visualization Theory and Applications – Volume 3, IVAPP ’21, pages 112–122. INSTICC, SciTePress.
Atzberger, D., Cech, T., Scheibel, W., Trapp, M., Richter, R., Döllner, J., and Schreck, T. (2023). Large-scale evaluation of topic models and dimensionality reduction methods for 2D text spatialization. IEEE Transactions on Visualization and Computer Graphics.
Behrisch, M., Blumenschein, M., Kim, N. W., Shao, L., El-
Assady, M., Fuchs, J., Seebacher, D., Diehl, A., Bran-
des, U., Pfister, H., Schreck, T., Weiskopf, D., and
Keim, D. A. (2018). Quality metrics for information
visualization. Wiley/EG Computer Graphics Forum,
37(3):625–662.
Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent
Dirichlet allocation. Journal of Machine Learning Re-
search, 3:993–1022.
Chang, J., Gerrish, S., Wang, C., Boyd-Graber, J., and Blei,
D. (2009). Reading tea leaves: How humans interpret
topic models. Advances in Neural Information Pro-
cessing Systems, 22.
Cox, M. A. A. and Cox, T. F. (2008). Multidimensional
scaling. In Handbook of Data Visualization, pages
315–347. Springer.
Crain, S. P., Zhou, K., Yang, S.-H., and Zha, H. (2012).
Dimensionality reduction and topic modeling: From
latent semantic indexing to latent Dirichlet allocation
and beyond. In Mining Text Data, pages 129–161.
Springer.
Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer,
T. K., and Harshman, R. (1990). Indexing by latent
semantic analysis. Journal of the American Society
for Information Science, 41(6):391–407.
Espadoto, M., Martins, R. M., Kerren, A., Hirata, N.
S. T., and Telea, A. C. (2021). Toward a quantita-
tive survey of dimension reduction techniques. IEEE
Transactions on Visualization and Computer Graph-
ics, 27(3):2153–2173.
Gisbrecht, A. and Hammer, B. (2015). Data visualization by
nonlinear dimensionality reduction. Wiley Data Min-
ing and Knowledge Discovery, 5(2):51–73.
Joia, P., Coimbra, D., Cuminato, J. A., Paulovich, F. V., and
Nonato, L. G. (2011). Local affine multidimensional
projection. IEEE Transactions on Visualization and
Computer Graphics, 17(12):2563–2571.
Kucher, K. and Kerren, A. (2019). Text visualization revis-
ited: The state of the field in 2019. In Proc. European
Conference on Visualization, EuroVis ’19, pages 29–
31. EG.
Lau, J. H., Newman, D., and Baldwin, T. (2014). Machine
reading tea leaves: Automatically evaluating topic co-
herence and topic model quality. In Proc. 14th Con-
ference of the European Chapter of the Association for
Computational Linguistics, pages 530–539. ACL.
Lee, D. D. and Seung, H. S. (1999). Learning the parts of
objects by non-negative matrix factorization. Springer
Nature, 401(6755):788–791.
McInnes, L., Healy, J., and Melville, J. (2020). UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv pre-print, arXiv:1802.03426.
Morariu, C., Bibal, A., Cutura, R., Frénay, B., and Sedlmair, M. (2023). Predicting user preferences of dimensionality reduction embedding quality. IEEE Transactions on Visualization and Computer Graphics, 29(1):745–755.
Nenkova, A. and McKeown, K. (2012). A survey of
text summarization techniques. In Mining Text Data,
pages 43–76. Springer.
Noether, G. E. (1981). Why Kendall tau? Wiley Teaching
Statistics, 3(2):41–43.
Paulovich, F. and Minghim, R. (2006). Text map explorer:
a tool to create and explore document maps. In Proc.