Authors: Jorge Fernandes, Andreia Artífice and Manuel J. Fonseca

Affiliation: INESC-ID/IST/Technical University of Lisbon, Portugal

ISBN: 978-989-8425-79-9

Keyword(s): LSA, LSA dimension, Unsupervised text classification, Bootstrapping.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Clustering and Classification Methods; Computational Intelligence; Evolutionary Computing; Information Extraction; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Soft Computing; Symbolic Systems

Abstract: Collections of information have grown to considerable sizes, making it hard to find and explore a particular subject. One way to address this problem is text classification, where a theme or category is assigned to a text based on an analysis of its content. However, existing approaches to text classification require effort and a high level of expertise from users, which makes them inaccessible to the common user. Another problem with current approaches is that they are optimized for a specific problem and cannot easily be adapted to another context. In particular, unsupervised methods based on the LSA algorithm require users to define the dimension used by the algorithm. In this paper we describe an approach that makes text classification more accessible to common users by providing a formula to estimate the LSA dimension from the number of texts used during the bootstrapping process. Experimental results show that our formula for estimating the LSA dimension allows us to create unsupervised solutions that achieve results similar to supervised approaches.
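
To illustrate where the LSA dimension enters the process, the following is a minimal sketch of LSA as a truncated SVD of a term-document matrix. The function names, the toy texts and the hand-picked value `k=2` are illustrative assumptions and do not come from the paper; the estimation formula itself is not given on this page, so it is not reproduced here. The only constraint the sketch enforces is the general one that the dimension cannot exceed the smaller of the number of terms and the number of texts.

```python
import numpy as np


def build_term_document_matrix(texts):
    """Return (matrix, vocabulary); rows are terms, columns are texts, cells are raw counts."""
    vocabulary = sorted({word for text in texts for word in text.lower().split()})
    index = {word: i for i, word in enumerate(vocabulary)}
    matrix = np.zeros((len(vocabulary), len(texts)))
    for j, text in enumerate(texts):
        for word in text.lower().split():
            matrix[index[word], j] += 1.0
    return matrix, vocabulary


def lsa_document_vectors(matrix, k):
    """Project each text into a k-dimensional latent space via truncated SVD."""
    max_k = min(matrix.shape)
    if not 1 <= k <= max_k:
        raise ValueError(f"k must be between 1 and {max_k}")
    _, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep only the k largest singular values; each row of the result is one text.
    return (np.diag(s[:k]) @ vt[:k, :]).T


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy corpus (hypothetical): two finance texts and two sports texts.
texts = [
    "stock market prices fall",
    "market shares and stock prices",
    "football team wins match",
    "team loses football match",
]
matrix, vocabulary = build_term_document_matrix(texts)
doc_vectors = lsa_document_vectors(matrix, k=2)  # k is chosen by hand here
print(doc_vectors.shape)  # (4, 2)
print(cosine(doc_vectors[0], doc_vectors[1]))  # texts on the same theme
print(cosine(doc_vectors[0], doc_vectors[2]))  # texts on different themes
```

In this sketch `k` is the value a user would normally have to pick manually, which is the step the paper proposes to automate.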

CC BY-NC-ND 4.0

Paper citation in several formats:
Fernandes, J.; Artífice, A. and Fonseca, M. J. (2011). AUTOMATIC ESTIMATION OF THE LSA DIMENSION. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011) ISBN 978-989-8425-79-9, pages 301-305. DOI: 10.5220/0003666103090313

@conference{kdir11,
author={Jorge Fernandes and Andreia Artífice and Manuel J. Fonseca},
title={AUTOMATIC ESTIMATION OF THE LSA DIMENSION},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)},
year={2011},
pages={301-305},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003666103090313},
isbn={978-989-8425-79-9},
}

TY  - CONF
JO  - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: KDIR, (IC3K 2011)
TI  - AUTOMATIC ESTIMATION OF THE LSA DIMENSION
SN  - 978-989-8425-79-9
AU  - Fernandes, J.
AU  - Artífice, A.
AU  - Fonseca, M. J.
PY  - 2011
SP  - 301
EP  - 305
DO  - 10.5220/0003666103090313
ER  - 