Authors:
Claudia Gomez Puyana and Alexandra Pomares Quimbaya
Affiliation:
Pontificia Universidad Javeriana, Colombia
Keyword(s):
Text Mining, Summary Generation, Natural Language Processing.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Data Mining; Databases and Information Systems Integration; Enterprise Information Systems; Sensor Networks; Signal Processing; Soft Computing
Abstract:
The excessive amount of narrative text available in diverse domains such as health (e.g. medical records), justice (e.g. laws, declarations), and assurance (e.g. declarations) increases the time required to analyze information in a decision-making process. Different approaches to generating summaries of these texts have been proposed to solve this problem. However, some of them do not take into account the sequentiality of the original document, which reduces the quality of the final summary; others create overall summaries that do not satisfy an end user who requires a summary related to their profile (e.g. different medical specializations require different information); and others do not analyze the potential duplication of information or the noise of natural language in the summary. To cope with these problems, this paper presents GReAT, a model for automatic summarization that relies on natural language processing and text mining techniques to extract the most relevant information from narrative texts, focused on the requirements of the end user. GReAT is an extraction-based summary generation model whose principle is to identify the information relevant to the user by filtering the text by topic and word frequency; it also reduces the number of phrases in the summary by avoiding the duplication of information. Experimental results show that GReAT improves the quality of the summary over other existing methods.
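The ideas named in the abstract (topic filtering, word-frequency scoring, duplicate avoidance, and preserving the original sentence order) can be illustrated with a minimal sketch. This is not the GReAT implementation, whose details the abstract does not give. The function names, the use of Jaccard overlap to detect duplicates, and the parameters `max_sentences` and `dup_threshold` are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def jaccard(a, b):
    """Jaccard overlap between two token lists (0.0 if either is empty)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def summarize(text, topic_terms, max_sentences=3, dup_threshold=0.6):
    """Hypothetical extractive summarizer (an illustration, not GReAT itself).

    1. Keep only sentences that mention at least one topic term.
    2. Score each kept sentence by the mean document frequency of its words.
    3. Pick the best-scoring sentences, skipping near-duplicates.
    4. Return the picked sentences in their original order.
    """
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    topic = {t.lower() for t in topic_terms}
    freq = Counter(w for toks in tokens for w in toks)

    candidates = []
    for i, toks in enumerate(tokens):
        if topic & set(toks):  # topic filter (assumed: simple keyword match)
            score = sum(freq[w] for w in toks) / len(toks)
            candidates.append((score, i))

    # Highest score first; ties resolved by earlier position in the text.
    candidates.sort(key=lambda c: (-c[0], c[1]))

    chosen = []
    for _, i in candidates:
        if len(chosen) >= max_sentences:
            break
        # Duplicate check (assumed: Jaccard overlap against chosen sentences).
        if all(jaccard(tokens[i], tokens[j]) < dup_threshold for j in chosen):
            chosen.append(i)

    # Restore document order so the summary keeps the original sequence.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(record_text, ["chest", "cardiac"])` on a medical note would return at most three sentences that mention either term, without repeated statements and in the order they appear in the note. A different user profile would be modeled here simply by passing a different list of topic terms.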