In the context of FAIR (Wilkinson et al.,
2016), the present work deals with the aspect of
(I)nteroperability, which aims at the enrichment of
data with rich metadata.
Due to the exponentially growing amount of sci-
entific data, the enrichment with metadata must be au-
tomated by using suitable tools.
1.2 Example Scenario
In order to get a clearer picture of the requirements for
such software, a concrete scenario will be presented
here. The Institute for Automation and Applied Informatics
(IAI) at KIT operates an experimental photovoltaic (PV)
system consisting of a large number
of panels that are connected together as an array (in
parallel) or string (in series). Additional components
of this PV system include power inverters, batter-
ies, measuring equipment, and so on. An overview
of the concepts and their relationships is shown in
Figure 1. By reconfiguring the components, a large
number of experiments can be carried out in order to
achieve certain goals (e.g. maximum average elec-
tricity yield), or to observe the behavior under certain
effects (e.g. partial shading).
As part of an experiment that is carried out with
a specific interconnection of the components over a
specific time interval and under specific weather con-
ditions, a series of result data is generated that is
stored in a time series database. In order to be able
to interpret the measurement results, it is necessary to
know the components’ connectivity and the weather
conditions at that time. This is an example of the
metadata that has to be managed by our tool, together
with the information on where the result values
are stored.
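To make the scenario more tangible, the metadata of a single experiment could be captured in a record like the one in the following minimal Python sketch. All class and field names (Component, Experiment, series_uri, and so on) as well as the locator format are illustrative assumptions; they are not taken from the IAI installation or from the ontology used in this work.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Component:
    """One element of the PV system, e.g. a panel or an inverter (hypothetical)."""
    identifier: str
    kind: str  # e.g. "panel", "inverter", "battery"


@dataclass
class Experiment:
    """Metadata needed to interpret the measured time series (illustrative only)."""
    start: datetime
    end: datetime
    weather: str
    series_uri: str  # where the result values are stored (assumed locator format)
    components: list[Component] = field(default_factory=list)
    # Each connection is (source id, target id, "series" or "parallel").
    connections: list[tuple[str, str, str]] = field(default_factory=list)

    def connect(self, source: Component, target: Component, mode: str) -> None:
        """Record how two components are wired together in this experiment."""
        if mode not in ("series", "parallel"):
            raise ValueError(f"unknown connection mode: {mode}")
        for component in (source, target):
            if component not in self.components:
                self.components.append(component)
        self.connections.append((source.identifier, target.identifier, mode))


# Usage: two panels connected in series (a string) on a partly shaded day.
p1 = Component("p1", "panel")
p2 = Component("p2", "panel")
experiment = Experiment(
    start=datetime(2024, 1, 1, 8, 0),
    end=datetime(2024, 1, 1, 18, 0),
    weather="partial shading",
    series_uri="tsdb://example/pv/exp-001",
)
experiment.connect(p1, p2, "series")
```

Keeping the wiring, the weather, and the storage location in one record is what allows the stored measurements to be interpreted later without consulting configuration files.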
1.3 Requirements for a Metadata
Management Component
In order to fulfill this task, the software must have
a persistence component that allows the specific test
setup (the metadata) to be saved. Since we are talking
about ontology-based metadata, the database schema
will be derived from the ontology. In addition, the
component must have an interface through which the
information can be entered. This can be, for exam-
ple, a simple web-based CRUD (Create, Read, Up-
date, and Delete) interface via which the individual
components of the system can be specified, or a dedi-
cated graphical editor with which the panels and other
components can be graphically created and linked to-
gether.
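As a rough illustration of how a database schema might be derived from an ontology, the following Python sketch maps classes to tables and properties to columns, and then performs two CRUD-style inserts. The dictionary ONTOLOGY is a deliberately simplified, hypothetical stand-in for a real ontology, and the mapping rule (datatype property to column, object property to foreign key) is an assumption made for this example, not the rule set used by our generator.

```python
import sqlite3

# Hypothetical, heavily simplified stand-in for an ontology:
# class name -> {property name: range}. A range that names another class is
# treated as an object property, anything else as a datatype property.
ONTOLOGY = {
    "Inverter": {"label": "string", "max_power_w": "float"},
    "Panel": {"label": "string", "peak_power_w": "float", "connected_to": "Inverter"},
}

SQL_TYPES = {"string": "TEXT", "float": "REAL", "integer": "INTEGER"}


def schema_from_ontology(ontology: dict[str, dict[str, str]]) -> list[str]:
    """Return one CREATE TABLE statement per ontology class."""
    statements = []
    for cls, props in ontology.items():
        columns = ["id INTEGER PRIMARY KEY"]
        for name, rng in props.items():
            if rng in ontology:  # object property -> foreign key column
                columns.append(f"{name}_id INTEGER REFERENCES {rng.lower()}(id)")
            else:  # datatype property -> plain column
                columns.append(f"{name} {SQL_TYPES[rng]}")
        statements.append(f"CREATE TABLE {cls.lower()} ({', '.join(columns)})")
    return statements


# Usage: create the schema in an in-memory database and store a small setup.
conn = sqlite3.connect(":memory:")
for statement in schema_from_ontology(ONTOLOGY):
    conn.execute(statement)
conn.execute(
    "INSERT INTO inverter (label, max_power_w) VALUES (?, ?)", ("INV-1", 5000.0)
)
conn.execute(
    "INSERT INTO panel (label, peak_power_w, connected_to_id) VALUES (?, ?, ?)",
    ("P-1", 400.0, 1),
)
```

A web-based CRUD interface or a graphical editor would then issue statements of this kind instead of the hard-coded inserts shown here.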
The software must also maintain the connection
between data and metadata. Data can be available in
a variety of formats, such as measurement data in a
time series database, relational data, parameter sets
of a learned neural network, as well as a variety of
proprietary data formats. Therefore, a component that
establishes this data-metadata connection is needed.
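One conceivable design for such a component is a registry that stores typed references from an experiment to its data and delegates the actual access to a format-specific resolver. The sketch below is only meant to illustrate this idea; the names DataReference, DataLinkRegistry, and the "tsdb://" locator are hypothetical, and the resolver is a stand-in that does not contact any real database.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DataReference:
    """Pointer from a metadata record to the data it describes (hypothetical)."""
    experiment_id: str
    kind: str      # e.g. "timeseries", "relational", "nn-parameters"
    location: str  # locator understood by the resolver registered for `kind`


class DataLinkRegistry:
    """Keeps the data-metadata connection and dispatches on the data format."""

    def __init__(self) -> None:
        self._resolvers: dict[str, Callable[[str], object]] = {}
        self._references: dict[str, list[DataReference]] = {}

    def register_resolver(self, kind: str, resolver: Callable[[str], object]) -> None:
        """Register the function that knows how to read data of one format."""
        self._resolvers[kind] = resolver

    def link(self, reference: DataReference) -> None:
        """Attach a data reference to its experiment; the format must be known."""
        if reference.kind not in self._resolvers:
            raise KeyError(f"no resolver registered for kind {reference.kind!r}")
        self._references.setdefault(reference.experiment_id, []).append(reference)

    def load(self, experiment_id: str) -> list[object]:
        """Resolve all data linked to the given experiment."""
        return [
            self._resolvers[ref.kind](ref.location)
            for ref in self._references.get(experiment_id, [])
        ]


# Usage with a stand-in resolver; a real one would query the time series database.
registry = DataLinkRegistry()
registry.register_resolver("timeseries", lambda location: f"rows from {location}")
registry.link(DataReference("exp-001", "timeseries", "tsdb://example/pv/exp-001"))
loaded = registry.load("exp-001")
```

Separating the reference from the resolver keeps the metadata store independent of the individual data formats, so that proprietary formats can be supported by registering an additional resolver.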
Such a tool can provide the data of the experi-
ment as well as the concrete setup (the metadata).
By preparing this information according to existing
standard formats, the tool can export the data together
with its metadata, or write both directly into a previously
specified repository such as the databus (Hoyer-Klick
et al., 2023). This repository then covers the
FAIR aspects (F)indable and (A)ccessible.
In contrast, in today’s practice, the information
about a specific experimental setup is frequently
available only implicitly in configuration files, installation
scripts, or makefiles, which makes it almost impossible
to extract this metadata.
The general functionality of the component just
described is therefore not only useful for the publi-
cation of semantically enriched FAIR data, but also
offers valuable services as an electronic notebook of
the experiments carried out.
In addition, it is not limited to the PV domain used
here as an example. The statements made here are
valid for any domain. While the general functional-
ity of the application is the same for all domains, the
structure of the metadata will be different. It depends
on the specific entities or concepts that describe the
specific application. These are described by CMs or
by the ontologies that describe the applications’ do-
mains.
1.4 Approach
This brings us to the core idea of our research approach:
the generation of an application, as described
in the previous section, on the basis of the available
domain information. We use a Model-Driven Soft-
ware Development (MDSD) approach (see Section 2
for details). The application to be realized is gener-
ated from a model description (the ontology) and a
number of transformation rules, which map the model
information to source code for a specific target plat-
form. In addition to the model information in the form
of an ontology, the generator can process further in-
formation such as a specific GUI layout or informa-
tion on the underlying software platform during the
generation process.
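The principle of generating source code from a model and a set of transformation rules can be sketched in a few lines of Python. The sketch is a toy illustration under stated assumptions and not the FAIRlead generator itself: MODEL is a hypothetical excerpt of a domain model given as a plain dictionary rather than an ontology, CLASS_RULE is a single made-up transformation rule, and plain Python dataclasses serve as the assumed target platform.

```python
from string import Template

# Hypothetical model excerpt: concept name -> {attribute name: Python type name}.
MODEL = {"Panel": {"label": "str", "peak_power_w": "float"}}

# One transformation rule for the assumed target platform (Python dataclasses).
CLASS_RULE = Template(
    "@dataclass\n"
    "class $name:\n"
    "$fields\n"
)


def generate(model: dict[str, dict[str, str]]) -> str:
    """Apply the transformation rule to every concept of the model."""
    parts = ["from dataclasses import dataclass\n"]
    for name, attrs in model.items():
        fields = "\n".join(f"    {attr}: {typ}" for attr, typ in attrs.items())
        parts.append(CLASS_RULE.substitute(name=name, fields=fields))
    return "\n".join(parts)


# Usage: generate the source text, then execute it only to show that it is valid.
source = generate(MODEL)
namespace: dict[str, object] = {}
exec(source, namespace)
panel = namespace["Panel"](label="P-1", peak_power_w=400.0)
```

Exchanging CLASS_RULE for a rule that emits, for example, SQL or web form code would retarget the same model to another platform, which is the main benefit of keeping model and transformation rules separate.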
The rest of the paper is structured as follows.
Section 2 presents the basic terms and the methodology
of MDSD. Then the concept of our
FAIRlead generator is presented in Section 3. Having
KMIS 2024 - 16th International Conference on Knowledge Management and Information Systems