
Authors: Thomas Lansdall-Welfare, Ilias Flaounas and Nello Cristianini

Affiliation: University of Bristol, United Kingdom

Keyword(s): Text categorisation, Graph construction, Label propagation, Large scale.

Related Ontology Subjects/Areas/Topics: Classification ; Ensemble Methods ; Pattern Recognition ; Sparsity ; Theory and Methods

Abstract: The efficient annotation of documents in vast corpora calls for scalable methods of text classification. Representing the documents as graph vertices, rather than as vectors in a bag-of-words space, allows the necessary information to be pre-computed and stored. It also fundamentally changes the problem definition, from a content-based to a relation-based classification problem. Efficiently creating a graph in which nearby documents are likely to have the same annotation is the central task of this paper. We compare the effectiveness of various approaches to graph construction by building graphs of 800,000 vertices based on the Reuters corpus, showing that relation-based classification is competitive with Support Vector Machines, which can be considered state of the art. We further show that combining our relation-based approach with Support Vector Machines improves on either method used individually.
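
The idea of relation-based classification in the abstract can be illustrated with a toy sketch. This is not the authors' method: the abstract does not say which similarity measure, graph construction scheme or propagation rule the paper uses, so the cosine similarity, the k-nearest-neighbour linking and the weighted-average update below are all assumptions, and the helper names (`build_knn_graph`, `propagate`) and the four example documents are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_knn_graph(docs, k):
    """Link each document to its k most similar documents (assumed scheme)."""
    vecs = [Counter(d.lower().split()) for d in docs]
    graph = {}
    for i, vi in enumerate(vecs):
        sims = [(cosine(vi, vj), j) for j, vj in enumerate(vecs) if j != i]
        sims.sort(reverse=True)
        # Keep only neighbours with non-zero similarity, as (index, weight).
        graph[i] = [(j, s) for s, j in sims[:k] if s > 0]
    return graph


def propagate(graph, seeds, iterations=10):
    """Spread seed scores (+1 / -1) along edges; seed vertices stay fixed."""
    scores = {i: seeds.get(i, 0.0) for i in graph}
    for _ in range(iterations):
        new = {}
        for i, nbrs in graph.items():
            if i in seeds:
                new[i] = seeds[i]
                continue
            total = sum(w for _, w in nbrs)
            # Weighted average of the neighbours' current scores.
            new[i] = sum(w * scores[j] for j, w in nbrs) / total if total else 0.0
        scores = new
    return scores


docs = [
    "oil prices rise as crude supply falls",   # labelled +1 (topic present)
    "crude oil supply cut lifts prices",       # unlabelled
    "football club wins league title",         # labelled -1 (topic absent)
    "league title race football club loses",   # unlabelled
]
graph = build_knn_graph(docs, k=1)
scores = propagate(graph, seeds={0: 1.0, 2: -1.0})
```

Once the graph is built, classifying an unlabelled document only requires its neighbours' scores, not its content, which is the shift from content-based to relation-based classification that the abstract describes. The all-pairs similarity loop here is quadratic and would not scale to 800,000 vertices; it is kept only for readability.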

CC BY-NC-ND 4.0

Paper citation in several formats:
Lansdall-Welfare, T.; Flaounas, I. and Cristianini, N. (2012). SCALABLE CORPUS ANNOTATION BY GRAPH CONSTRUCTION AND LABEL PROPAGATION. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM; ISBN 978-989-8425-98-0; ISSN 2184-4313, SciTePress, pages 25-34. DOI: 10.5220/0003728700250034

@conference{icpram12,
author={Thomas Lansdall{-}Welfare and Ilias Flaounas and Nello Cristianini},
title={SCALABLE CORPUS ANNOTATION BY GRAPH CONSTRUCTION AND LABEL PROPAGATION},
booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM},
year={2012},
pages={25-34},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003728700250034},
isbn={978-989-8425-98-0},
issn={2184-4313},
}

TY - CONF

JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM
TI - SCALABLE CORPUS ANNOTATION BY GRAPH CONSTRUCTION AND LABEL PROPAGATION
SN - 978-989-8425-98-0
IS - 2184-4313
AU - Lansdall-Welfare, T.
AU - Flaounas, I.
AU - Cristianini, N.
PY - 2012
SP - 25
EP - 34
DO - 10.5220/0003728700250034
PB - SciTePress