Where Did I(T) Put It? - A Holistic Solution to the Automatic Construction of Topic Trees for Navigation

Hans Friedrich Witschel, Barbara Thönssen, Jonas Lutz

2014

Abstract

Hierarchical structures are the prevailing way of managing information, whether documents are stored physically in a file structure, as in MS Explorer, or virtually in topic trees, as in many web applications. The problem is that such structures evolve over time and are created by individuals, so they reflect individual opinions of how information objects should be grouped. This leads to time-consuming searches and error-prone retrieval results, since relevant documents may be stored somewhere other than where the searcher expects them. Our approach aims to solve this problem by replacing or complementing manually created navigation structures with automatically created ones. We consider existing approaches to clustering and labelling and focus on aspects that have so far received little attention, such as placing information objects in inner nodes (as is common in folder hierarchies) and cognitively adequate labelling of textual and non-textual resources. The evaluation was carried out by knowledge experts and compared the time needed to find given documents in manually and automatically generated information structures; it showed an advantage for the automatically created topic trees.



Paper Citation


in Harvard Style

Witschel H. F., Thönssen B. and Lutz J. (2014). Where Did I(T) Put It? - A Holistic Solution to the Automatic Construction of Topic Trees for Navigation. In Proceedings of the International Conference on Knowledge Management and Information Sharing - Volume 1: KMIS, (IC3K 2014), ISBN 978-989-758-050-5, pages 194-202. DOI: 10.5220/0005075201940202


in Bibtex Style

@conference{kmis14,
author={Hans Friedrich Witschel and Barbara Thönssen and Jonas Lutz},
title={Where Did I(T) Put It? - A Holistic Solution to the Automatic Construction of Topic Trees for Navigation},
booktitle={Proceedings of the International Conference on Knowledge Management and Information Sharing - Volume 1: KMIS, (IC3K 2014)},
year={2014},
pages={194-202},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005075201940202},
isbn={978-989-758-050-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Knowledge Management and Information Sharing - Volume 1: KMIS, (IC3K 2014)
TI - Where Did I(T) Put It? - A Holistic Solution to the Automatic Construction of Topic Trees for Navigation
SN - 978-989-758-050-5
AU - Witschel H. F.
AU - Thönssen B.
AU - Lutz J.
PY - 2014
SP - 194
EP - 202
DO - 10.5220/0005075201940202