Authors: M. Al Hajj Hassan and M. Bamha

Affiliation: LIFO, Université d’Orléans, France

Keyword(s): Parallel DataBase Management Systems (PDBMS), Parallel joins, Data skew, Join product skew, GroupBy-Join queries, BSP cost model.

Related Ontology Subjects/Areas/Topics: Databases and Datawarehouses ; Distributed and Parallel Applications ; Internet Technology ; Web Information Systems and Technologies

Abstract: SQL queries involving join and group-by operations are common in decision support applications, where the input relations are usually very large, so parallelizing these queries is highly recommended in order to obtain an acceptable response time. The most significant drawbacks of the algorithms presented in the literature for treating such queries are that they are very sensitive to data skew and incur expensive communication and Input/Output costs when evaluating the join operation. In this paper, we present an algorithm that overcomes these drawbacks: it evaluates the "GroupBy-Join" query without directly evaluating the costly join operation, thus reducing its Input/Output and communication costs. Furthermore, the performance of this algorithm is analyzed using the scalable and portable BSP (Bulk Synchronous Parallel) cost model, which predicts a linear speedup even for highly skewed data.
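
The abstract does not spell out the algorithm, so the following is only a minimal single-machine sketch of the general idea it names: answering a GroupBy-Join query without materializing the join. It is not the authors' parallel algorithm. It assumes a hypothetical query of the form SELECT k, SUM(S.v) FROM R JOIN S ON R.k = S.k GROUP BY k, and all function and variable names are illustrative. Each relation is first aggregated on the join attribute, and the partial results are then combined per key, so the work depends on the number of distinct keys rather than on the size of the join result.

```python
from collections import Counter, defaultdict


def groupby_join_naive(r_keys, s_rows):
    """Baseline: materialize the join R ⋈ S on k, then compute SUM(S.v) per k."""
    joined = [(k, v) for k in r_keys for (sk, v) in s_rows if sk == k]
    totals = defaultdict(int)
    for k, v in joined:
        totals[k] += v
    return dict(totals)


def groupby_join_preaggregated(r_keys, s_rows):
    """Same result without building the join (illustrative assumption).

    Each S tuple with key k appears once per matching R tuple in the join,
    so SUM(S.v) for k equals count_R(k) * sum_S(k).
    """
    r_count = Counter(r_keys)      # number of R tuples per join key
    s_sum = defaultdict(int)       # partial SUM(S.v) per join key
    for k, v in s_rows:
        s_sum[k] += v
    # Only keys present in both relations contribute to the join.
    return {k: r_count[k] * s_sum[k] for k in r_count if k in s_sum}


if __name__ == "__main__":
    # Key 1 is skewed in R; key 4 has no match in R and key 3 none in S.
    r_keys = [1, 1, 1, 2, 3]
    s_rows = [(1, 10), (1, 5), (2, 7), (4, 100)]
    print(groupby_join_naive(r_keys, s_rows))
    print(groupby_join_preaggregated(r_keys, s_rows))
```

In this sketch a heavily repeated key only increases a counter instead of multiplying the number of joined tuples, which hints at why avoiding the direct join can reduce sensitivity to skew.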

License: CC BY-NC-ND 4.0

Paper citation in several formats:
Al Hajj Hassan, M. and Bamha, M. (2007). AN OPTIMAL EVALUATION OF GROUPBY-JOIN QUERIES IN DISTRIBUTED ARCHITECTURES. In Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 2: WEBIST; ISBN 978-972-8865-77-1; ISSN 2184-3252, SciTePress, pages 246-252. DOI: 10.5220/0001281302460252

@conference{webist07,
author={M. {Al Hajj Hassan} and M. Bamha},
title={AN OPTIMAL EVALUATION OF GROUPBY-JOIN QUERIES IN DISTRIBUTED ARCHITECTURES},
booktitle={Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 2: WEBIST},
year={2007},
pages={246-252},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001281302460252},
isbn={978-972-8865-77-1},
issn={2184-3252},
}

TY - CONF
JO - Proceedings of the Third International Conference on Web Information Systems and Technologies - Volume 2: WEBIST
TI - AN OPTIMAL EVALUATION OF GROUPBY-JOIN QUERIES IN DISTRIBUTED ARCHITECTURES
SN - 978-972-8865-77-1
IS - 2184-3252
AU - Al Hajj Hassan, M.
AU - Bamha, M.
PY - 2007
SP - 246
EP - 252
DO - 10.5220/0001281302460252
PB - SciTePress