Scalable and Efficient Big Data Analytics - The LeanBigData Approach

Ricardo Jimenez, Marta Patino, Valerio Vianello, Ivan Brondino, Ricardo Vilaca, Jorge Teixeira, Miguel Biscaia, Giannis Drossis, Damien Michel, Chryssi Birliraki, George Margetis, Antonis Argyros, Constantine Stephanidis, Luigi Sgaglione, Gaetano Papale, Giavanni Mazzeo, Ferdinando Campanile, Marc Sole, Victor Muntés-Mulero, David Solans, Alberto Huelamo, Pavlos Kranas, Dora Varvarigou, Vrettos Moulos, Fotis Aisopos

Abstract

One of the major problems in enterprise data management is the separation between operational databases and data warehouses. This separation is motivated by the different capabilities of OLTP and OLAP data management systems. Because of it, data must be copied periodically from the operational databases to the data warehouses. These copies are performed by a process called Extract-Transform-Load (ETL), which turns out to account for 80% of the budget of performing business analytics. The main goal of LeanBigData has been to address this major pain point by delivering a real-time big data platform that provides both functions, OLTP and OLAP, in a single data management solution. The way to achieve this goal has been to leverage an ultra-scalable OLTP database, LeanXcale, and to develop a new OLAP engine that works directly over the operational data. The platform is based on a novel storage engine that provides extreme levels of efficiency. The platform also has an integrated parallel-distributed complex event processing (CEP) engine that scales the processing of streaming data and that can be combined with the processing of data at rest in the new OLTP+OLAP database to address a wide variety of data management problems. LeanBigData has a bigger vision and aims at providing an end-to-end analytics platform. This platform provides a visual workbench that enables data scientists to discover new insights. The platform is also enriched with a subsystem for anomaly detection and root cause analysis that works with the newly developed system and makes it possible to perform this analysis over streaming data. The LeanBigData platform has been validated in four real-world use case scenarios: cloud data centre monitoring, fraud detection in direct debit operations, sentiment analysis in social networks, and targeted advertisement.


Paper Citation


in Harvard Style

Jimenez R., Patino M., Vianello V., Brondino I., Vilaca R., Teixeira J., Biscaia M., Drossis G., Michel D., Birliraki C., Margetis G., Argyros A., Stephanidis C., Sgaglione L., Papale G., Mazzeo G., Campanile F., Sole M., Muntés-Mulero V., Solans D., Huelamo A., Kranas P., Varvarigou D., Moulos V. and Aisopos F. (2016). Scalable and Efficient Big Data Analytics - The LeanBigData Approach. In European Space project on Smart Systems, Big Data, Future Internet - Towards Serving the Grand Societal Challenges - Volume 1: EPS Rome 2016, ISBN 978-989-758-207-3, pages 92-111. DOI: 10.5220/0007903100920111


in Bibtex Style

@conference{epsrome201616,
author={Ricardo Jimenez and Marta Patino and Valerio Vianello and Ivan Brondino and Ricardo Vilaca and Jorge Teixeira and Miguel Biscaia and Giannis Drossis and Damien Michel and Chryssi Birliraki and George Margetis and Antonis Argyros and Constantine Stephanidis and Luigi Sgaglione and Gaetano Papale and Giavanni Mazzeo and Ferdinando Campanile and Marc Sole and Victor Muntés-Mulero and David Solans and Alberto Huelamo and Pavlos Kranas and Dora Varvarigou and Vrettos Moulos and Fotis Aisopos},
title={Scalable and Efficient Big Data Analytics - The LeanBigData Approach},
booktitle={European Space project on Smart Systems, Big Data, Future Internet - Towards Serving the Grand Societal Challenges - Volume 1: EPS Rome 2016},
year={2016},
pages={92-111},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007903100920111},
isbn={978-989-758-207-3},
}


in EndNote Style

TY - CONF

JO - European Space project on Smart Systems, Big Data, Future Internet - Towards Serving the Grand Societal Challenges - Volume 1: EPS Rome 2016
TI - Scalable and Efficient Big Data Analytics - The LeanBigData Approach
SN - 978-989-758-207-3
AU - Jimenez R.
AU - Patino M.
AU - Vianello V.
AU - Brondino I.
AU - Vilaca R.
AU - Teixeira J.
AU - Biscaia M.
AU - Drossis G.
AU - Michel D.
AU - Birliraki C.
AU - Margetis G.
AU - Argyros A.
AU - Stephanidis C.
AU - Sgaglione L.
AU - Papale G.
AU - Mazzeo G.
AU - Campanile F.
AU - Sole M.
AU - Muntés-Mulero V.
AU - Solans D.
AU - Huelamo A.
AU - Kranas P.
AU - Varvarigou D.
AU - Moulos V.
AU - Aisopos F.
PY - 2016
SP - 92
EP - 111
DO - 10.5220/0007903100920111