Authors: Rachid Aknouche, Ounas Asfari, Fadila Bentayeb and Omar Boussaid

Affiliation: ERIC Laboratory, France

ISBN: 978-989-8565-66-2

Keyword(s): Extract-Transform-Load, Textual Data, Text Warehousing, Text Warehouse Model, TWM, data integration, decisional architecture, information retrieval, 20 Newsgroups

Abstract: In this paper, we propose an original approach to the text warehousing process. It is based on a decisional architecture that combines classical data warehousing tasks with information retrieval (IR) techniques. We first propose a new ETL process, named ETL-Text, for textual data integration. We then present a new Text Warehouse Model, denoted TWM, which takes into account both the structure and the semantics of the textual data. TWM is associated with new dimension types, including a metadata dimension and a semantic dimension. In addition, we propose a new analysis measure based on language modeling, which is widely used in the IR area. Our approach also relies on Wikipedia as an external knowledge source to extract the semantics of the textual documents. To validate the approach, we develop a prototype composed of several processing modules that illustrate the different steps of ETL-Text, and we use the 20 Newsgroups corpus for our experiments.

CC BY-NC-ND 4.0

Paper citation in several formats:
Aknouche, R.; Asfari, O.; Bentayeb, F. and Boussaid, O. (2013). Integration Process for Multidimensional Textual Data Modeling. In Proceedings of the 1st International Workshop in Software Evolution and Modernization - Volume 1: SEM, (ENASE 2013) ISBN 978-989-8565-66-2, pages 119-126. DOI: 10.5220/0004602501190126

@conference{sem13,
author={Rachid Aknouche and Ounas Asfari and Fadila Bentayeb and Omar Boussaid},
title={Integration Process for Multidimensional Textual Data Modeling},
booktitle={Proceedings of the 1st International Workshop in Software Evolution and Modernization - Volume 1: SEM, (ENASE 2013)},
year={2013},
pages={119-126},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004602501190126},
isbn={978-989-8565-66-2},
}

TY  - CONF
JO  - Proceedings of the 1st International Workshop in Software Evolution and Modernization - Volume 1: SEM, (ENASE 2013)
TI  - Integration Process for Multidimensional Textual Data Modeling
SN  - 978-989-8565-66-2
AU  - Aknouche, R.
AU  - Asfari, O.
AU  - Bentayeb, F.
AU  - Boussaid, O.
PY  - 2013
SP  - 119
EP  - 126
DO  - 10.5220/0004602501190126
ER  - 
