Authors: Flora Amato; Francesco Gargiulo; Antonino Mazzeo and Carlo Sansone

Affiliation: University of Naples Federico II, Italy

Keyword(s): Topic Detection, Clustering, tf-idf, Feature Reduction.

Abstract: Topic extraction has become increasingly important because it is effective in many tasks, including information filtering, information retrieval, and the organization of document collections in digital libraries. Topic detection consists of finding the most significant topics within a document corpus. In this paper we explore the use of a feature reduction methodology to highlight those topics. Our approach applies a clustering algorithm (X-means) to the tf-idf matrix computed from the corpus, in which each row represents a document and each column represents a term, so that each entry describes the weighted frequency of a term in a document. To extract the topics, we build n binary problems, where n is the number of clusters produced by the unsupervised clustering step, and we apply supervised feature selection to each of them, taking the top-ranked features as the topic descriptors. We report the results obtained on two different corpora, both written in Italian: the first consists of documents of the University of Naples Federico II, and the second is a collection of medical records.
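
The pipeline described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation, and several parts are assumptions made only for illustration: a fixed-k k-means with a deterministic farthest-point initialization stands in for X-means (which chooses the number of clusters itself), the feature score is the difference between a term's mean tf-idf weight inside and outside a cluster (the abstract does not name the selection criterion), and the function names, the whitespace tokenizer, and the toy English corpus are all hypothetical.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Build a tf-idf matrix: one row per document, one column per term."""
    tokenized = [doc.lower().split() for doc in docs]  # assumed tokenizer
    vocab = sorted({t for toks in tokenized for t in toks})
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(len(docs) / doc_freq[t]) for t in vocab}
    rows = []
    for toks in tokenized:
        tf = Counter(toks)
        rows.append([tf[t] / max(len(toks), 1) * idf[t] for t in vocab])
    return vocab, rows


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, n_iter=20):
    """Fixed-k k-means, used here only as a stand-in for X-means."""
    # Farthest-point initialization keeps the sketch deterministic.
    centroids = [rows[0]]
    while len(centroids) < k:
        centroids.append(
            max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        )
    labels = [0] * len(rows)
    for _ in range(n_iter):
        labels = [
            min(range(k), key=lambda j: sq_dist(r, centroids[j])) for r in rows
        ]
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def topic_descriptors(vocab, rows, labels, top_n=3):
    """One binary problem per cluster (cluster vs. rest); keep the top terms."""
    topics = {}
    for c in sorted(set(labels)):
        inside = [r for r, lab in zip(rows, labels) if lab == c]
        outside = [r for r, lab in zip(rows, labels) if lab != c]
        scored = []
        for j, term in enumerate(vocab):
            mean_in = sum(r[j] for r in inside) / len(inside)
            mean_out = sum(r[j] for r in outside) / len(outside) if outside else 0.0
            # Assumed score: how much more weight the term has inside the cluster.
            scored.append((mean_in - mean_out, term))
        scored.sort(key=lambda s: (-s[0], s[1]))
        topics[c] = [term for _, term in scored[:top_n]]
    return topics


# Hypothetical toy corpus: two university-like and two medical-like documents.
docs = [
    "exam course student exam",
    "student course lecture exam",
    "patient therapy diagnosis patient",
    "diagnosis patient hospital therapy",
]
vocab, rows = tfidf_matrix(docs)
labels = kmeans(rows, k=2)
topics = topic_descriptors(vocab, rows, labels)
for cluster, terms in topics.items():
    print(cluster, terms)
```

On this toy corpus the two groups of documents share no terms, so the clustering step separates them cleanly and each cluster's descriptors come from its own vocabulary. On a real corpus the choice of k, the tokenizer, and the feature score would all need to be revisited.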

CC BY-NC-ND 4.0

Paper citation in several formats:
Amato, F.; Gargiulo, F.; Mazzeo, A. and Sansone, C. (2014). A Method of Topic Detection for Great Volume of Data. In Proceedings of 3rd International Conference on Data Management Technologies and Applications (DATA 2014) - KomIS; ISBN 978-989-758-035-2; ISSN 2184-285X, SciTePress, pages 434-439. DOI: 10.5220/0005145504340439

@conference{komis14,
author={Flora Amato and Francesco Gargiulo and Antonino Mazzeo and Carlo Sansone},
title={A Method of Topic Detection for Great Volume of Data},
booktitle={Proceedings of 3rd International Conference on Data Management Technologies and Applications (DATA 2014) - KomIS},
year={2014},
pages={434-439},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005145504340439},
isbn={978-989-758-035-2},
issn={2184-285X},
}

TY - CONF

JO - Proceedings of 3rd International Conference on Data Management Technologies and Applications (DATA 2014) - KomIS
TI - A Method of Topic Detection for Great Volume of Data
SN - 978-989-758-035-2
IS - 2184-285X
AU - Amato, F.
AU - Gargiulo, F.
AU - Mazzeo, A.
AU - Sansone, C.
PY - 2014
SP - 434
EP - 439
DO - 10.5220/0005145504340439
PB - SciTePress