In an IIoT environment, data, services and functions
are stored and processed where they are needed. This
contrasts with the traditional approach, in which
data was manipulated to fit the systems of a
historically grown ecosystem of information management
systems, i.e. one characterized by the different levels
of the automation pyramid. Implementing such a
data-driven approach requires new design patterns in
order to comply with these business needs. The business
requirements might include applying various use cases
to the same information, or a context-specific
visualization of information depending on the person
viewing the data. A thorough list of the resulting
concerns and challenges can be found in
(Jeschke et al., 2017).
Once IIoT has been implemented successfully, vast
amounts of data can be acquired, which ultimately
gives rise to IBD. Depending on the protocol
implemented in the IIoT application, the integrated
information is available in a rather structured form.
In further steps, this data is processed to enable
deeper analysis and to extract insights and valuable
information.
2.2 Industrial Big Data
One of the most important outcomes of emerging IIoT
is the generation of large, centrally accumulated and
stored data volumes that grow at an unprecedented
rate; this volatility in the speed of data generation
is one of the major characteristics of IBD (Mourtzis
et al., 2016). According to (McKinsey, 2017), in 2010
manufacturing stored more data than any other sector,
an estimated two exabytes. In summary, the basic
characteristics of IBD are the high volume, velocity,
and variety of data (Laney, 2011), although new
characteristics are continuously being introduced,
with “value” being the most important (Yin and
Kaynak, 2015). In comparison with BD, IBD usually has
a higher data quality and is more structured, more
correlated, more orderly in time and better prepared
for extracting insights (with both basic and advanced
methods) (Lee et al., 2015). However, IBD has higher
demands in terms of flexibility and application-specific
utilization of data.
According to (Kuschicke et al., 2017), the term
IBD stands not only for industrial data itself, but
also for the techniques and methods to utilize and
process the data. To underline this characterization,
the definition of Wilder-James fits quite well: “Big
data is data that exceeds the processing capacity of
conventional database systems. The data is too big,
moves too fast, or does not fit the structures of your
database architectures. To gain value from this data,
you must choose an alternative way to process it. [. . . ]
To clarify matters, the three Vs of volume, velocity
and variety are commonly used to characterize dif-
ferent aspects of big data. They are a helpful lens
through which to view and understand the nature of
the data and the software platforms available to
exploit them.” (Wilder-James, 2012) Thus, BD as well
as IBD solutions are built around this view of
Big Data, seeking methods to deal with the
characteristics of volume, velocity and variety.
In addition, IBD involves further methods such as
data acquisition, storage, and management techniques.
To make use of the gathered and consolidated data,
methods originating from the domains of data
visualization, data mining, machine learning, and
artificial intelligence are applied (Chen and Zhang,
2014; Gluchowski et al., 2007), which complete the
tool-set of IBD.
2.3 Industrial Big Data Architectures
The application of IBD techniques requires certain
guidance, as the process of gaining valuable knowl-
edge from industrial data involves several steps. For
optimal results (little effort, short delivery time, high
value) the different steps need to be performed in
close coordination. Therefore, different approaches,
ranging from high-level solutions to detailed
instructions with examples, have been developed and
presented so far. Unfortunately, the existing
approaches do not comprehensively match the
requirements and needs of the manufacturing industry.
The following survey is not intended to be complete,
but rather to provide an overview of the main streams
in this research area. The examples provided have a
rather high-level, functional focus or target generic
solutions.
One high-level architecture for BD is the so-called
‘big data pipeline’ published in (Bertino et al., 2011).
The authors describe a serial process of multiple
phases, which are necessary steps to enable the
analysis and thus to expose the hidden potential in
the data. The pipeline consists of the phases
“acquisition / recording”, “extraction / cleaning /
annotation”, “integration / aggregation /
representation”, “analysis / modeling” and
“interpretation”. The pipeline gives readers a basic
overview of how to extract value from BD in general.
However, the presented approach describes neither how
to link the various phases in the pipeline nor how to
implement the contents of the individual phases.
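To make the serial character of the five phases concrete, the following
Python sketch chains them as plain functions. It is only an illustration
under stated assumptions: the record layout, the function names, the
temperature limit and the idea of linking phases by passing each output
to the next function are all hypothetical and are not taken from
(Bertino et al., 2011), which leaves both the linking and the contents
of the phases open.

```python
from statistics import mean


def acquire():
    """Phase 1, acquisition / recording: hard-coded stand-in for sensor input.

    Each tuple is (machine id, timestamp, temperature); all values are
    invented for illustration.
    """
    return [
        ("m1", 0, 71.0),
        ("m1", 1, None),  # a dropped reading
        ("m1", 2, 73.0),
        ("m2", 0, 64.0),
        ("m2", 1, 66.0),
    ]


def extract_clean_annotate(records):
    """Phase 2: discard missing readings and annotate each record with a unit."""
    return [
        {"machine": machine, "t": t, "temp": value, "unit": "degC"}
        for machine, t, value in records
        if value is not None
    ]


def integrate_aggregate(records):
    """Phase 3: bring the readings into one representation, grouped by machine."""
    grouped = {}
    for record in records:
        grouped.setdefault(record["machine"], []).append(record["temp"])
    return grouped


def analyze(grouped):
    """Phase 4, analysis / modeling: here simply the mean temperature per machine."""
    return {machine: mean(values) for machine, values in grouped.items()}


def interpret(means, limit=70.0):
    """Phase 5: translate numbers into a statement (the limit is an assumption)."""
    return {m: ("check" if avg > limit else "ok") for m, avg in means.items()}


def run_pipeline(source, phases):
    """Run the phases strictly one after another, feeding each output forward."""
    data = source()
    for phase in phases:
        data = phase(data)
    return data


result = run_pipeline(
    acquire,
    [extract_clean_annotate, integrate_aggregate, analyze, interpret],
)
print(result)
```

Passing each phase's output directly to the next phase is the simplest
possible coupling; a real IBD system would more likely place storage or
messaging between the phases.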
A more detailed approach has been developed by
the Industrial Internet Consortium (Industrial Internet
Consortium, 2017) which is referred to as the Indus-