Authors: Alejandro Sierra Múnera 1 ; Alexandra Pomares Quimbaya 2 ; Rafael Andrés González Rivera 2 ; Julián Camilo Daza Rodríguez 3 ; Oscar Mauricio Muñoz Velandia 1 and Angel Alberto Garcia Peña 1

Affiliations: 1 Pontificia Universidad Javeriana and Hospital Universitario San Ignacio, Colombia ; 2 Pontificia Universidad Javeriana, Colombia ; 3 Hospital Universitario San Ignacio, Colombia

Keyword(s): Text Classification, Training Set, Labeling.

Related Ontology Subjects/Areas/Topics: Artificial Intelligence ; Knowledge Management and Information Sharing ; Knowledge-Based Systems ; Symbolic Systems ; Tools and Technology for Knowledge Management

Abstract: Most text classification techniques rely on training data sets to build models. However, in many text classification projects, previously labeled texts are rarely available because of differences in language (e.g. Spanish), domain (e.g. healthcare) and regional or institutional writing culture (e.g. a specific hospital). To help address this problem, this paper presents LABAS-TS, a web-enabled system for assisting the open, collaborative labeling of training sets for text classification. LABAS-TS is framed within a named entity recognition approach that identifies important entities from a domain-specific corpus, based on gazetteers, and uses a language-specific sentence analyzer that extracts the portions of text that should be annotated. LABAS-TS was evaluated in the generation of training data sets to classify whether an electronic health record text contains a diagnosis, a test or a procedure, and demonstrated its utility in reducing the time required to build a reliable training set, with an average of eleven seconds between two labels.
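The gazetteer-plus-sentence-analyzer idea described in the abstract can be illustrated with a minimal sketch. This is not the LABAS-TS implementation: the gazetteer entries, the entity type names, the regex-based sentence splitter and all function names below are assumptions made for illustration only, and a real language-specific analyzer would be considerably more careful with clinical abbreviations and punctuation.

```python
import re

# Hypothetical gazetteer (surface term -> entity type); these entries are
# illustrative assumptions, not taken from the paper.
GAZETTEER = {
    "diabetes": "DIAGNOSIS",
    "hemograma": "TEST",
    "biopsia": "PROCEDURE",
}

# Naive sentence boundary: whitespace that follows '.', '!' or '?'.
# A stand-in for the language-specific sentence analyzer.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split text into trimmed, non-empty sentences."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def find_entities(sentence):
    """Return sorted (term, type) pairs for gazetteer terms found as whole words."""
    lowered = sentence.lower()
    found = []
    for term, label in GAZETTEER.items():
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.append((term, label))
    return sorted(found)


def candidates_for_labeling(text):
    """Keep only the sentences that mention at least one gazetteer entity."""
    out = []
    for sentence in split_sentences(text):
        entities = find_entities(sentence)
        if entities:
            out.append({"sentence": sentence, "entities": entities})
    return out


if __name__ == "__main__":
    record = "Paciente con diabetes tipo 2. Se solicita hemograma. Control en un mes."
    for candidate in candidates_for_labeling(record):
        print(candidate)
```

In this sketch, only sentences that contain a gazetteer term are passed on to annotators, which is one plausible way to reduce the amount of text a human has to read per label.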

CC BY-NC-ND 4.0

Paper citation in several formats:
Sierra Múnera, A.; Pomares Quimbaya, A.; Andrés González Rivera, R.; Daza Rodríguez, J.; Muñoz Velandia, O. and Garcia Peña, A. (2017). LABAS-TS - A System for Assisting Labeling of Training Sets for Text Classification. In Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2017) - KMIS; ISBN 978-989-758-273-8; ISSN 2184-3228, SciTePress, pages 174-180. DOI: 10.5220/0006504901740180

@conference{kmis17,
author={Alejandro {Sierra Múnera} and Alexandra {Pomares Quimbaya} and Rafael {Andrés González Rivera} and Julián Camilo {Daza Rodríguez} and Oscar Mauricio {Muñoz Velandia} and Angel Alberto {Garcia Peña}},
title={LABAS-TS - A System for Assisting Labeling of Training Sets for Text Classification},
booktitle={Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2017) - KMIS},
year={2017},
pages={174-180},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006504901740180},
isbn={978-989-758-273-8},
issn={2184-3228},
}

TY - CONF
JO - Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2017) - KMIS
TI - LABAS-TS - A System for Assisting Labeling of Training Sets for Text Classification
SN - 978-989-758-273-8
IS - 2184-3228
AU - Sierra Múnera, A.
AU - Pomares Quimbaya, A.
AU - Andrés González Rivera, R.
AU - Daza Rodríguez, J.
AU - Muñoz Velandia, O.
AU - Garcia Peña, A.
PY - 2017
SP - 174
EP - 180
DO - 10.5220/0006504901740180
PB - SciTePress