L4SDD not only defines an output data format but
also specifies how the data is to be calculated from
the various sensor inputs. The converted data can later
be directly stored in databases and processed by data
analysis techniques. The main goal of this paper is to
present this language and the workbench around the
language in detail. Although the discussion is based
on our concrete approach and implementation, the
ideas and solutions are generic and therefore also
useful in other environments.
2.3 Related Approaches
In order to achieve syntactic interoperability in cross-
industry projects, we need a common data format
understandable, readable and writable by all
participants. In order to reach this goal, some of the
existing approaches focus on defining a universal
format applicable in all scenarios and domains. In our
case, this is not enough, since even if we succeed in
identifying such a format, the capabilities of the
sensor devices are strongly limited, and they cannot
convert the data to the desired format by themselves.
Thus, our goal was twofold: (i) a format that can
describe the structure of any data independently of
its domain, and (ii) a solution for transforming the
original data into this standard form.
The Data Distribution Service (DDS) (OMG: Data
Distribution Service, 2015) is a popular data-centric
publish-subscribe protocol defined by OMG. It was
created to handle communication between the
participants, but using it would require creating
adapters for IoT devices. DDS offers a standard way to
describe the data format, but it does not address data
transformation.
The OPC-Unified Architecture (OPC-UA) (OPC-
Unified Architecture, 2015) is a popular machine-to-
machine protocol for industrial automation. Its basic
idea is promising; however, at the current stage, it is
a pre-release standard rather than a working, platform-
independent solution. Most of the issues stem from
the various incomplete implementations.
The Sensor Markup Language (SenML) (Network
Working Group: Sensor Markup Language, 2013) was
created to describe sensor measurements and devices,
which could fit our scenario, but SenML supports only
the XML, JSON, Concise Binary Object Representation
(CBOR) and Efficient XML Interchange (EXI) formats.
This is not suitable in our case because of the limited
capabilities of the sensors.
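To picture the gap between a SenML-style record and what a constrained device might emit natively, the following minimal sketch compares the two encodings of a single temperature reading. The JSON field names (`n`, `u`, `v`) follow the later RFC 8428 representation of SenML; the 4-byte binary layout is purely an assumption made for illustration and is not taken from any device discussed here.

```python
import json
import struct

# One temperature reading as a SenML-style JSON record
# (field names n/u/v as in the RFC 8428 JSON representation).
senml_record = [{"n": "urn:dev:ow:10e2073a01080063", "u": "Cel", "v": 23.1}]
json_payload = json.dumps(senml_record, separators=(",", ":")).encode("utf-8")

# Hypothetical device-native layout (an assumption for illustration):
# unsigned 16-bit sensor id followed by a signed 16-bit temperature
# in tenths of a degree Celsius, both big-endian.
binary_payload = struct.pack(">Hh", 0x0063, 231)

json_size = len(json_payload)      # size of the SenML-style encoding in bytes
binary_size = len(binary_payload)  # size of the packed encoding in bytes
```

The packed form is far smaller and needs no serializer on the device, but it is meaningless without an external description of its layout, which is the role a data description script has to play.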
The Data Format Description Language (DFDL)
(Open Grid Forum: Data Format Description
Language, 2014) (McGrath et al., 2009) perhaps comes
closest to solving our challenges. It is a
modeling language for describing general text and
binary data in a standard way. The schemas in DFDL
allow any text or binary data to be read from its native
format and written into a destination language. The
standard has several implementations available, and
it can be integrated with several system technologies.
Despite its promising capabilities, we could not use
DFDL. The most important reason is that DFDL
implementations are tied to a concrete platform on
which the conversion is performed. In contrast, the
implementation platform of our data conversion
framework must be interchangeable (e.g., instead of
Java, we should be able to switch to a JavaScript
platform). Moreover, we wanted to optimize the
conversion and to ensure its safety. By defining a new
script language with limited but efficient features and
creating an environment around it (e.g., compiler,
execution framework), we could achieve these goals
more easily.
3 THE LANGUAGE FOR SENSOR
DATA DESCRIPTION
In creating L4SDD, a new, purpose-built script
language, our primary aim was to devise a dynamic
data description solution. Before discussing the
language, it is useful to introduce how the data
processing algorithm is applied in our framework.
When a script is created, it is compiled to source
code for the target platform (currently JavaScript,
although the target is configurable). The generated
data processor function is then registered with the
framework. Later, when the framework receives a
sensor data message, the registered data processors
are queried. Each processor has a filter that decides
whether the processor is applicable to the data. If the
filter accepts the data, the data transformation is
applied.
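The register-then-dispatch flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the framework's actual implementation: the names `DataProcessor`, `Framework`, `register` and `on_message` are hypothetical, and messages are modeled as plain dictionaries.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical message type: the paper does not specify the framework's API.
Message = Dict[str, Any]


@dataclass
class DataProcessor:
    """Stand-in for a compiled script: a filter plus a transformation."""
    name: str
    accepts: Callable[[Message], bool]       # the filter
    transform: Callable[[Message], Message]  # the data transformation


class Framework:
    def __init__(self) -> None:
        self._processors: List[DataProcessor] = []

    def register(self, processor: DataProcessor) -> None:
        """Register a generated data processor."""
        self._processors.append(processor)

    def on_message(self, message: Message) -> List[Message]:
        """Query every registered processor and apply those whose filter accepts."""
        return [p.transform(message) for p in self._processors if p.accepts(message)]


framework = Framework()
framework.register(DataProcessor(
    name="thermometer",
    accepts=lambda m: m.get("type") == "temp",
    transform=lambda m: {"celsius": m["raw"] / 10},
))

results = framework.on_message({"type": "temp", "raw": 215})
ignored = framework.on_message({"type": "humidity", "raw": 40})
```

In this sketch a message that no filter accepts simply yields an empty result list; how the real framework treats unmatched messages is not stated in the text.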
L4SDD scripts consist of several sections: (i) an
Output definition that describes the format of the
output data; (ii) a Filter definition that is used when
the framework tries to find scripts applicable to the
specific data; (iii) a Mapping definition that defines
the conversion itself, namely how the output data is
produced from the input data; and (iv) optionally, a
Params definition, where additional parameters can be
passed (e.g., the current location), which can affect
the result of Mapping. These parameters are not sent
by the sensor to the framework; instead, the
framework appends the information as an additional
input parameter for the script when it is executed.
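The interplay of the four sections can be modeled in a few lines. The sketch below is not L4SDD syntax; it is a hypothetical in-memory model (`Script`, `execute` and all field names are assumptions) that shows how framework-supplied parameters reach the mapping alongside the sensor data. Checking the result against the Output description is likewise an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional

Record = Dict[str, Any]


@dataclass
class Script:
    """Hypothetical model of the four sections of a script."""
    output: Dict[str, type]                      # Output: field name -> type
    filter: Callable[[Record], bool]             # Filter: is the script applicable?
    mapping: Callable[[Record, Record], Record]  # Mapping: (input, params) -> output
    params: Dict[str, type] = field(default_factory=dict)  # Params: optional


def execute(script: Script, sensor_data: Record,
            framework_params: Record) -> Optional[Record]:
    """Run the script on one message; return None if the filter rejects it."""
    if not script.filter(sensor_data):
        return None
    # Params are supplied by the framework, not by the sensor.
    for name, expected in script.params.items():
        if not isinstance(framework_params.get(name), expected):
            raise TypeError(f"parameter {name!r} must be {expected.__name__}")
    result = script.mapping(sensor_data, framework_params)
    # Assumption of this sketch: validate the result against the Output section.
    for name, expected in script.output.items():
        if not isinstance(result.get(name), expected):
            raise TypeError(f"output field {name!r} must be {expected.__name__}")
    return result


script = Script(
    output={"celsius": float, "location": str},
    filter=lambda d: "raw_temp" in d,
    mapping=lambda d, p: {"celsius": d["raw_temp"] / 10, "location": p["location"]},
    params={"location": str},
)

converted = execute(script, {"raw_temp": 215}, {"location": "lab-1"})
skipped = execute(script, {"humidity": 40}, {"location": "lab-1"})
```

Here the sensor sends only `raw_temp`, while `location` is appended by the caller at execution time, mirroring the role of the Params section.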
The Output and Params sections use the same
language elements and syntax (they are static format
descriptions), while the Filter and Mapping sections