Authors: Ricardo Jiménez-Peris 1 ; Francisco Ballesteros 2 ; Ainhoa Azqueta 3 ; Pavlos Kranas 1 ; Diego Burgos 1 and Patricio Martínez 1

Affiliations: 1 LeanXcale, Campus de Montegancedo, Madrid, Spain ; 2 Universidad Rey Juan Carlos, Madrid, Spain ; 3 Universidad Politécnica de Madrid, Spain

Keyword(s): Loading, Extract-Transform-Load (ETL), Scalable Databases, NUMA Architectures, Database Appliance, Scalable Transactional Management.

Abstract: In this paper we discuss how we architected and developed a parallel data loader for the LeanXcale database. The loader is characterized by its efficiency and parallelism. LeanXcale can scale up and scale out to very large scales, and loading data in the traditional way does not exploit its full potential in terms of the loading rate it can reach. For this reason, we have created a parallel loader that can reach the maximum insertion rate LeanXcale can handle. LeanXcale also exposes a dual interface, key-value and SQL, which the parallel loader exploits: loading goes through the key-value API, which avoids the overhead of SQL processing and makes the process highly efficient. Finally, to guarantee parallelism, we have developed a data sampler that samples the data to build a histogram of its distribution. The histogram is used to pre-split the regions across LeanXcale instances so that all instances receive an even amount of data during loading, which lets the deployment reach its peak loading capability.
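
The sampling and pre-splitting idea in the abstract can be illustrated with a minimal Python sketch. It is not LeanXcale's loader or API: the function names (`sample_keys`, `split_points`, `region_of`, `region_loads`), the use of integer keys, and the choice of equal-frequency quantiles as region boundaries are all assumptions made for illustration.

```python
import bisect
import random
from typing import List, Sequence


def sample_keys(keys: Sequence[int], sample_size: int, seed: int = 0) -> List[int]:
    """Draw a uniform random sample of keys without replacement (hypothetical helper)."""
    rng = random.Random(seed)
    return rng.sample(list(keys), min(sample_size, len(keys)))


def split_points(sample: Sequence[int], num_regions: int) -> List[int]:
    """Pick num_regions - 1 boundaries at equal-frequency quantiles of the sample.

    Assumption: the sorted sample stands in for the histogram of the key
    distribution, so each region should receive roughly the same number of rows.
    """
    ordered = sorted(sample)
    return [ordered[(i * len(ordered)) // num_regions] for i in range(1, num_regions)]


def region_of(key: int, boundaries: Sequence[int]) -> int:
    """Return the index of the region whose key range contains the key."""
    return bisect.bisect_right(boundaries, key)


def region_loads(keys: Sequence[int], boundaries: Sequence[int]) -> List[int]:
    """Count how many keys fall into each region for the given boundaries."""
    counts = [0] * (len(boundaries) + 1)
    for key in keys:
        counts[region_of(key, boundaries)] += 1
    return counts
```

With skewed keys, calling `split_points(sample_keys(keys, 5000), 4)` and then `region_loads(keys, boundaries)` should yield four regions of roughly equal size, whereas splitting the key range into equal-width intervals would concentrate most rows in one region.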

CC BY-NC-ND 4.0

Paper citation in several formats:
Jiménez-Peris, R.; Ballesteros, F.; Azqueta, A.; Kranas, P.; Burgos, D. and Martínez, P. (2019). Parallel Efficient Data Loading. In Proceedings of the 8th International Conference on Data Science, Technology and Applications - ADITCA; ISBN 978-989-758-377-3; ISSN 2184-285X, SciTePress, pages 465-469. DOI: 10.5220/0008318904650469

@conference{aditca19,
author={Ricardo Jiménez{-}Peris and Francisco Ballesteros and Ainhoa Azqueta and Pavlos Kranas and Diego Burgos and Patricio Martínez},
title={Parallel Efficient Data Loading},
booktitle={Proceedings of the 8th International Conference on Data Science, Technology and Applications - ADITCA},
year={2019},
pages={465-469},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0008318904650469},
isbn={978-989-758-377-3},
issn={2184-285X},
}

TY - CONF

JO - Proceedings of the 8th International Conference on Data Science, Technology and Applications - ADITCA
TI - Parallel Efficient Data Loading
SN - 978-989-758-377-3
IS - 2184-285X
AU - Jiménez-Peris, R.
AU - Ballesteros, F.
AU - Azqueta, A.
AU - Kranas, P.
AU - Burgos, D.
AU - Martínez, P.
PY - 2019
SP - 465
EP - 469
DO - 10.5220/0008318904650469
PB - SciTePress