A NOVEL SUPERVISED TEXT CLASSIFIER FROM A SMALL TRAINING SET

Fabio Clarizia; Francesco Colace; Massimo De Santo; Luca Greco; Paolo Napoletano

Topics: Clustering and Classification Methods; Information Extraction; Machine Learning; Mining Text and Semi-Structured Data

In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval - Volume 1: SSTM, 537-545, 2011, Paris, France

Authors: Fabio Clarizia; Francesco Colace; Massimo De Santo; Luca Greco; Paolo Napoletano

Affiliation: University of Salerno, Italy

Keyword(s): Text classification, Term extraction, Probabilistic topic model.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence; Clustering and Classification Methods; Computational Intelligence; Evolutionary Computing; Information Extraction; Knowledge Discovery and Information Retrieval; Knowledge-Based Systems; Machine Learning; Mining Text and Semi-Structured Data; Soft Computing; Symbolic Systems

Abstract: Text classification methods have been evaluated on supervised classification tasks over large datasets, where they show high accuracy. However, because these classifiers need to learn from many examples to perform well on a test set, difficulties can arise when they are deployed in real contexts. Most users of a practical system do not want to spend a long time on labeling tasks only to gain a better level of accuracy; they prefer algorithms that achieve high accuracy without requiring a large amount of manual labeling. In this paper we propose a new supervised method for single-label text classification, based on a mixed Graph of Terms, that achieves good accuracy when the training set is reduced to 1% of its original size. The mixed Graph of Terms can be extracted automatically from a set of documents using a term clustering technique weighted by a probabilistic topic model. The method was tested on the top 10 classes of the ModApte split of the Reuters-21578 dataset, with learning performed on 1% of the original training set. The results confirm the discriminative property of the graph and show that the proposed method is comparable with existing methods trained on the whole training set.
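To make the idea of classifying with a per-class graph of terms more concrete, the following is a minimal sketch and not the authors' method. It assumes a toy setting: nodes are the most frequent terms of a class, edges are weighted by plain co-occurrence frequency (swapped in here for the probabilistic topic model weighting mentioned in the abstract), and a document is assigned to the class whose graph it matches best. The names `tokenize`, `build_term_graph`, `score`, `classify`, and the tiny training set are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real preprocessing."""
    tokens = (t.strip(".,;:!?()").lower() for t in text.split())
    return [t for t in tokens if t]


def build_term_graph(docs, top_k=5):
    """Build a toy term graph for one class (assumption: not the paper's graph).

    Nodes are the top_k terms by document frequency; each edge weight is the
    fraction of documents in which both of its terms occur.
    """
    doc_terms = [set(tokenize(d)) for d in docs]
    df = Counter(t for terms in doc_terms for t in terms)
    # Sort by (-frequency, term) so ties are broken deterministically.
    nodes = set(sorted(df, key=lambda t: (-df[t], t))[:top_k])
    pair_counts = Counter()
    for terms in doc_terms:
        for a, b in combinations(sorted(terms & nodes), 2):
            pair_counts[(a, b)] += 1
    n = len(docs)
    node_w = {t: df[t] / n for t in nodes}
    edge_w = {pair: c / n for pair, c in pair_counts.items()}
    return node_w, edge_w


def score(doc, graph):
    """Sum the weights of the graph's nodes and edges matched by the document."""
    node_w, edge_w = graph
    terms = set(tokenize(doc))
    total = sum(w for t, w in node_w.items() if t in terms)
    total += sum(w for (a, b), w in edge_w.items() if a in terms and b in terms)
    return total


def classify(doc, graphs):
    """Return the label whose term graph gives the highest score."""
    return max(graphs, key=lambda label: score(doc, graphs[label]))


# Hypothetical two-class training set with only two documents per class.
train = {
    "earn": ["net profit rose in the quarter", "quarter profit and dividend rose"],
    "grain": ["wheat and corn exports rose", "corn harvest and wheat prices"],
}
graphs = {label: build_term_graph(docs) for label, docs in train.items()}

print(classify("wheat corn shipment", graphs))
print(classify("profit rose this quarter", graphs))
```

The sketch only shows why pairs of related terms can remain discriminative when very few labeled documents are available; a faithful reproduction would need the term extraction and weighting described in the paper itself.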

CC BY-NC-ND 4.0

Paper citation in several formats:

Clarizia, F.; Colace, F.; De Santo, M.; Greco, L. and Napoletano, P. (2011). A NOVEL SUPERVISED TEXT CLASSIFIER FROM A SMALL TRAINING SET. In Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2011) - SSTM; ISBN 978-989-8425-79-9; ISSN 2184-3228, SciTePress, pages 537-545. DOI: 10.5220/0003661105450553

@conference{sstm11,
author={Fabio Clarizia and Francesco Colace and Massimo {De Santo} and Luca Greco and Paolo Napoletano},
title={A NOVEL SUPERVISED TEXT CLASSIFIER FROM A SMALL TRAINING SET},
booktitle={Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2011) - SSTM},
year={2011},
pages={537-545},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003661105450553},
isbn={978-989-8425-79-9},
issn={2184-3228},
}

TY - CONF
JO - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (IC3K 2011) - SSTM
TI - A NOVEL SUPERVISED TEXT CLASSIFIER FROM A SMALL TRAINING SET
SN - 978-989-8425-79-9
IS - 2184-3228
AU - Clarizia, F.
AU - Colace, F.
AU - De Santo, M.
AU - Greco, L.
AU - Napoletano, P.
PY - 2011
SP - 537
EP - 545
DO - 10.5220/0003661105450553
PB - SciTePress