Authors: Stephen Bradshaw 1 ; Colm O'Riordan 1 and Daragh Bradshaw 2

Affiliations: 1 National University of Ireland Galway, Ireland ; 2 National University Limerick, Ireland

Keyword(s): Document Clustering, Graph Theory, WordNet, Classification, Word Sense Disambiguation, Data Mining.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Clustering and Classification Methods ; Context Discovery ; Knowledge Discovery and Information Retrieval ; Knowledge-Based Systems ; Symbolic Systems

Abstract: Clustering documents is a common task in a range of information retrieval systems and applications. Many approaches for improving the clustering process have been proposed. One approach is to use an ontology to better inform the classifier of word context by expanding the items to be clustered. WordNet is commonly cited as an appropriate source from which to draw the additional terms; however, it may not be sufficient to achieve strong performance. We have two aims in this paper. First, we show that the use of WordNet may lead to suboptimal performance. This problem may be accentuated when a document set has been drawn from comments made in social forums, due to the unstructured nature of online conversations compared to standard document sets. Second, we propose a novel method which involves constructing a bespoke ontology that facilitates better clustering. We present a study of clustering applied to a sample of threads from a social forum and investigate the effectiveness of the application of these methods.
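
The general idea of expanding documents with ontology terms before clustering can be illustrated with a minimal sketch. This is not the authors' method or ontology: `TOY_ONTOLOGY`, `expand`, and `cosine` are hypothetical names introduced here, and the hand-written dictionary merely stands in for WordNet or a generated ontology. The sketch shows how two documents with no words in common can gain a non-zero similarity once related terms are appended.

```python
import math
from collections import Counter

# Hypothetical toy ontology (term -> related terms); an assumption for
# illustration only, standing in for WordNet or a bespoke ontology.
TOY_ONTOLOGY = {
    "car": ["vehicle", "automobile"],
    "truck": ["vehicle"],
    "apple": ["fruit"],
    "banana": ["fruit"],
}


def expand(tokens, ontology):
    """Return the original tokens plus any related terms from the ontology."""
    expanded = list(tokens)
    for tok in tokens:
        expanded.extend(ontology.get(tok, []))
    return expanded


def cosine(a, b):
    """Cosine similarity between two token lists using raw term counts."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


doc_a = "my car broke down".split()
doc_b = "the truck needs fuel".split()

# Without expansion the documents share no terms.
before = cosine(doc_a, doc_b)

# After expansion both contain "vehicle", so they become comparable.
after = cosine(expand(doc_a, TOY_ONTOLOGY), expand(doc_b, TOY_ONTOLOGY))

print(f"similarity before expansion: {before:.3f}")
print(f"similarity after expansion:  {after:.3f}")
```

A clustering algorithm fed the expanded representations would be more likely to group the two documents together; whether the added terms help or hurt depends on how well the ontology matches the vocabulary of the collection, which is the concern the abstract raises about WordNet and forum text.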

CC BY-NC-ND 4.0


Paper citation in several formats:
Bradshaw, S.; O'Riordan, C. and Bradshaw, D. (2017). Improving Document Clustering Performance: The Use of an Automatically Generated Ontology to Augment Document Representations. In Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2017) - KDIR; ISBN 978-989-758-271-4; ISSN 2184-3228, SciTePress, pages 215-223. DOI: 10.5220/0006500202150223

@conference{kdir17,
author={Stephen Bradshaw and Colm O'Riordan and Daragh Bradshaw},
title={Improving Document Clustering Performance: The Use of an Automatically Generated Ontology to Augment Document Representations},
booktitle={Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2017) - KDIR},
year={2017},
pages={215-223},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006500202150223},
isbn={978-989-758-271-4},
issn={2184-3228},
}

TY - CONF

JO - Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2017) - KDIR
TI - Improving Document Clustering Performance: The Use of an Automatically Generated Ontology to Augment Document Representations
SN - 978-989-758-271-4
IS - 2184-3228
AU - Bradshaw, S.
AU - O'Riordan, C.
AU - Bradshaw, D.
PY - 2017
SP - 215
EP - 223
DO - 10.5220/0006500202150223
PB - SciTePress