Authors:
Joan Navarro (1); Ainhoa Azqueta-Alzuaz (2); Pablo Murta Baião Albino (2) and José Enrique Armendáriz-Iñigo (2)
Affiliations:
(1) La Salle and Universitat Ramon Llull, Spain; (2) Universidad Pública de Navarra, Spain
Keyword(s):
Cloud computing, Hadoop, MapReduce, Data consistency, SABI database.
Related Ontology Subjects/Areas/Topics:
Data Engineering; Data Management and Quality; Data Storage and Query Processing; Distributed and Mobile Software Systems; Grid, Peer-To-Peer, and Cluster Computing; Software Engineering
Abstract:
Cloud computing, implemented by tool suites such as Amazon S3, Dynamo, or Hadoop, was designed to overcome classical constraints of distributed systems (i.e., poor scale-out, low elasticity, and static behaviour) and to provide high scalability when dealing with large amounts of data. This paper proposes using Hadoop to (1) process financial data efficiently and (2) detect and correct errors in data repositories; in particular, the work focuses on the SABI database. A set of operations, when performed with the distributed computation paradigm, may achieve higher calculation performance.
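
To make the error-detection idea concrete, the following is a minimal in-memory sketch of the MapReduce pattern in plain Python. It is not the authors' implementation and does not use Hadoop itself. The record layout (company identifier, year, reported value), the sample data, and the function names `map_record`, `shuffle`, `reduce_group`, and `find_conflicts` are all assumptions made for illustration; none of them are taken from SABI. The sketch flags keys for which duplicate entries report conflicting values, which is one plausible kind of inconsistency a distributed job could look for.

```python
from collections import defaultdict

# Hypothetical records: (company_id, year, reported_value). Not real SABI data.
RECORDS = [
    ("A001", 2010, 1500.0),
    ("A001", 2010, 1500.0),  # duplicate entry with the same value: no conflict
    ("B002", 2010, 320.0),
    ("B002", 2010, 410.0),   # duplicate entry with a different value: conflict
    ("C003", 2011, 75.5),
]


def map_record(record):
    """Map phase: emit ((company_id, year), value) for a single record."""
    company_id, year, value = record
    yield (company_id, year), value


def shuffle(pairs):
    """Shuffle phase: group values by key (a framework does this between map and reduce)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce phase: emit the key only if its values disagree."""
    distinct = sorted(set(values))
    if len(distinct) > 1:
        yield key, distinct


def find_conflicts(records):
    """Run map, shuffle, and reduce sequentially and collect the flagged keys."""
    pairs = (pair for record in records for pair in map_record(record))
    groups = shuffle(pairs)
    return [
        hit
        for key, values in sorted(groups.items())
        for hit in reduce_group(key, values)
    ]


if __name__ == "__main__":
    for key, values in find_conflicts(RECORDS):
        print(key, values)
```

In a real deployment the map and reduce functions would run in parallel across cluster nodes, and the grouping step would be handled by the framework; here everything runs sequentially in one process so that the data flow is easy to follow.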