Authors:
Corinna Giebler; Christoph Stach; Holger Schwarz and Bernhard Mitschang
Affiliation:
Institute for Parallel and Distributed Systems, University of Stuttgart, Universitätsstraße 38, D-70569 Stuttgart, Germany
Keyword(s):
Big Data, IoT, Batch Processing, Stream Processing, Lambda Architecture, Kappa Architecture.
Abstract:
The Internet of Things is applied in many domains and collects vast amounts of data. Analyzed comprehensively, this data yields substantial knowledge. However, advanced analysis techniques such as predictive or prescriptive analytics require access to both history data, i.e., long-term persisted data, and real-time data, as well as a joint view of both types of data. State-of-the-art hybrid processing architectures for big data, namely the Lambda and the Kappa Architecture, support the processing of history data and real-time data. However, they lack a tight coupling of the two processing modes. As a result, the user has to do much of the work manually to enable a comprehensive analysis of the data. For instance, the user has to combine the results of both processing modes or apply knowledge from one processing mode to the other. Therefore, we introduce a novel hybrid processing architecture for big data, called BRAID. BRAID intertwines the processing of history data and real-time data by adding communication channels between the batch engine and the stream engine. This makes it possible to carry out comprehensive analyses automatically at a reasonable overhead.
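The idea of communication channels between a batch engine and a stream engine can be illustrated with a minimal Python sketch. This is not the authors' implementation: the queue-based channels, the function names, the mean-based baseline, and the 1.5x alert threshold are all assumptions chosen only to show how knowledge derived from history data might be applied to real-time data, and how real-time records might flow back into the history.

```python
from queue import Queue
from statistics import mean

# Hypothetical channels between the two engines (illustrative names only).
batch_to_stream = Queue()  # knowledge derived from history data
stream_to_batch = Queue()  # real-time records to be added to the history


def batch_engine(history):
    """Absorb forwarded records, recompute a baseline, publish it to the stream engine."""
    while not stream_to_batch.empty():
        history.append(stream_to_batch.get())
    baseline = mean(history)
    batch_to_stream.put(baseline)
    return baseline


def stream_engine(events):
    """Flag events exceeding 1.5x the latest baseline (an assumed rule)."""
    baseline = None
    alerts = []
    for value in events:
        # Pick up the most recent knowledge sent by the batch engine.
        while not batch_to_stream.empty():
            baseline = batch_to_stream.get()
        if baseline is not None and value > 1.5 * baseline:
            alerts.append(value)
        # Forward every record so the batch engine can persist it.
        stream_to_batch.put(value)
    return alerts


history = [10.0, 12.0, 11.0, 9.0]
first_baseline = batch_engine(history)       # baseline from history data only
alerts = stream_engine([10.0, 20.0, 11.0])   # real-time data judged against it
second_baseline = batch_engine(history)      # history now includes streamed records
```

In this sketch the user never merges the two result sets by hand; the exchange happens over the channels, which is the kind of coupling the abstract describes. A real deployment would presumably replace the in-process queues with a messaging system and run both engines concurrently.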