PARALLEL PROCESSING OF "GROUP-BY JOIN" QUERIES ON SHARED NOTHING MACHINES

M. Al Hajj Hassan; M. Bamha

Topics: Load Balancing

In Proceedings of the First International Conference on Software and Data Technologies - Volume 1: ICSOFT, pages 301-307, 2006, Setúbal, Portugal

Authors: M. Al Hajj Hassan and M. Bamha

Affiliation: LIFO, Université d’Orléans, France

Keyword(s): PDBMS, Parallel joins, Data skew, Join product skew, GroupBy-Join queries, BSP cost model.

Related Ontology Subjects/Areas/Topics: Energy and Economy; Load Balancing in Smart Grids; Smart Grids

Abstract: SQL queries involving join and group-by operations are frequently used in decision support applications. In these applications, the input relations are usually very large, so parallelizing these queries is highly recommended to obtain an acceptable response time. The main drawbacks of previously presented parallel algorithms for this kind of query are that they are very sensitive to data skew and incur expensive communication and input/output costs when evaluating the join operation. In this paper, we present an algorithm that minimizes the communication cost by performing the group-by operation before redistribution, so that only tuples that will be present in the join result are redistributed. In addition, it evaluates the query without materializing the result of the join operation, thus reducing the input/output cost of join intermediate results. The performance of this algorithm is analyzed using the scalable and portable BSP (Bulk Synchronous Parallel) cost model, which predicts a near-linear speed-up even for highly skewed data.
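The idea in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' algorithm: it simulates the processors as Python lists, assumes the query `SELECT R.k, SUM(R.v) FROM R JOIN S ON R.k = S.k GROUP BY R.k`, and uses a plain set intersection as a stand-in for whatever exchange of key information the paper actually uses. All names (`local_aggregates`, `group_by_join`, `r_frags`, `s_frags`) are hypothetical.

```python
from collections import defaultdict


def local_aggregates(r_frag, s_frag):
    """Group-by before redistribution: per-key SUM(v) for R, per-key COUNT for S."""
    r_sum = defaultdict(int)
    for key, value in r_frag:
        r_sum[key] += value
    s_cnt = defaultdict(int)
    for key in s_frag:
        s_cnt[key] += 1
    return dict(r_sum), dict(s_cnt)


def group_by_join(r_frags, s_frags):
    """Simulate p shared-nothing processors; fragment i lives on processor i."""
    p = len(r_frags)

    # Step 1: every processor aggregates its own fragments locally.
    local = [local_aggregates(r, s) for r, s in zip(r_frags, s_frags)]

    # Step 2 (assumption): find the keys that occur in both relations.
    # A real system would exchange compact key information between
    # processors; here the union/intersection is computed directly.
    r_keys = set().union(*(r_sum for r_sum, _ in local))
    s_keys = set().union(*(s_cnt for _, s_cnt in local))
    joining = r_keys & s_keys

    # Step 3: hash-redistribute only partial aggregates whose key joins.
    inbox_r = [defaultdict(int) for _ in range(p)]
    inbox_s = [defaultdict(int) for _ in range(p)]
    sent = 0  # number of partial aggregates that cross the "network"
    for r_sum, s_cnt in local:
        for key, total in r_sum.items():
            if key in joining:
                inbox_r[hash(key) % p][key] += total
                sent += 1
        for key, count in s_cnt.items():
            if key in joining:
                inbox_s[hash(key) % p][key] += count
                sent += 1

    # Step 4: combine per key without materializing the join result.
    # Each R tuple with key k matches COUNT_S(k) tuples of S, so
    # SUM(R.v) over the join equals SUM_R(k) * COUNT_S(k).
    result = {}
    for dest in range(p):
        for key, total in inbox_r[dest].items():
            result[key] = total * inbox_s[dest][key]
    return result, sent


# Two simulated processors; R holds (k, v) pairs, S holds keys only.
r_frags = [[(1, 10), (1, 5), (2, 7)], [(1, 1), (3, 4)]]
s_frags = [[1, 4], [1, 2, 2]]
result, sent = group_by_join(r_frags, s_frags)
```

In this toy input, keys 3 and 4 never leave their processor because they have no join partner, and duplicate keys are collapsed before being sent, which is the communication saving the abstract describes. The sketch does not address load balancing under skew or the BSP cost analysis.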

CC BY-NC-ND 4.0

Paper citation in several formats:

Al Hajj Hassan, M. and Bamha, M. (2006). PARALLEL PROCESSING OF "GROUP-BY JOIN" QUERIES ON SHARED NOTHING MACHINES. In Proceedings of the First International Conference on Software and Data Technologies - Volume 1: ICSOFT; ISBN 978-972-8865-69-6; ISSN 2184-2833, SciTePress, pages 301-307. DOI: 10.5220/0001316003010307

@conference{icsoft06,
author={M. {Al Hajj Hassan} and M. Bamha},
title={PARALLEL PROCESSING OF ``GROUP-BY JOIN'' QUERIES ON SHARED NOTHING MACHINES},
booktitle={Proceedings of the First International Conference on Software and Data Technologies - Volume 1: ICSOFT},
year={2006},
pages={301-307},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001316003010307},
isbn={978-972-8865-69-6},
issn={2184-2833},
}

TY - CONF
JO - Proceedings of the First International Conference on Software and Data Technologies - Volume 1: ICSOFT
TI - PARALLEL PROCESSING OF "GROUP-BY JOIN" QUERIES ON SHARED NOTHING MACHINES
SN - 978-972-8865-69-6
IS - 2184-2833
AU - Al Hajj Hassan, M.
AU - Bamha, M.
PY - 2006
SP - 301
EP - 307
DO - 10.5220/0001316003010307
PB - SciTePress