ZeitGeist: A Generic Tool Supporting the Dissemination of Time Series Data Following FAIR Principles

Andreas Schmidt (1, 2, a), Mohamad Anis Koubaa (1, b), Jan Schweikert (1, c), Karl-Uwe Stucky (1, d), Wolfgang Süß (1, e) and Veit Hagenmeyer (1, f)

(1) Institute for Automation and Applied Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany

(2) Department of Computer Science and Business Information Systems, University of Applied Sciences, Karlsruhe, Germany

Keywords:

FAIR Data, RO-Crate, Time Series Data, Export-Query-Configurator.

Abstract:

An important factor in the widespread dissemination of FAIR data is keeping the entry barrier for preparing and providing data to other scientists according to the FAIR criteria as low as possible. If scientists have to manually extract, transform and annotate the data according to the FAIR criteria and then export it to make it publicly available, this requires a significant investment of time that does not primarily reward the scientist who prepares and provides the data. The Energy Lab at KIT runs a large InfluxDB cluster in which energy-related time series data have been stored in a variety of individual databases over periods of up to 15 years. To increase the willingness to make data available to the scientific community, we are developing a tool that supports and largely automates the publication and annotation of time series data stored in Influx databases.

1 INTRODUCTION

The results of a survey published in Nature (Baker, 2016) revealed that more than 70% of the scientists surveyed had tried and failed to reproduce experiments of other scientists. The article also mentions other studies in the fields of cancer research and psychology, where it is estimated that only between 10% and 40% of the experiments described can be reproduced. This has serious consequences. On the one hand, it decreases trust in publications and thus in science as a whole; on the other hand, it wastes resources if a lot of time has to be invested in verifying results from other papers before they can be built upon.

One way to increase reproducibility is to make the original data on which the experiments are based available. This is now also being driven forward by a number of research institutions around the world, in Germany, for example, by the Helmholtz Association of German Research Centers, the largest scientific organization in Germany with over 44,000 employees and an annual budget of 5.8 billion euros (as of 2020), which recently launched the Helmholtz Metadata Collaboration (HMC) project. The goal of HMC is to develop and establish novel methods and tools to document research data using enriched metadata (HMC, 2023). Another large organisation in Germany is the German National Research Data Infrastructure (NFDI) (NFDI, 2023). The NFDI aims to create a permanent digital repository of knowledge as an indispensable prerequisite for new research questions, findings and innovations. NFDI consortia, associations of various institutions within a research field, work together in an interdisciplinary manner to implement this goal. An important consortium is NFDI4Energy (NFDI4Energy, 2023), a national research data infrastructure for interdisciplinary energy system research.

a https://orcid.org/0000-0002-9911-5881
b https://orcid.org/0000-0001-8552-2008
c https://orcid.org/0000-0003-4774-2717
d https://orcid.org/0000-0002-0065-0762
e https://orcid.org/0000-0003-2785-7736
f https://orcid.org/0000-0002-3572-9083

In 2016, Wilkinson et al. published a paper (Wilkinson et al., 2016) in which they formally described the FAIR Guiding Principles, which had first been postulated two years earlier at a workshop in Leiden, Netherlands. FAIR stands for Findability, Accessibility, Interoperability, and Reusability of digital artifacts. A central idea is the description of scientific artifacts by metadata. It is therefore not surprising that the FAIR principles also play an important role in HMC (Buttigieg et al., 2022) and NFDI4Energy (NFDI4Energy, 2023). Section 3 summarizes the most important aspects of FAIR.

Schmidt, A., Koubaa, M., Schweikert, J., Stucky, K., Süß, W. and Hagenmeyer, V.
ZeitGeist: A Generic Tool Supporting the Dissemination of Time Series Data Following FAIR Principles.
DOI: 10.5220/0012254300003598
In Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2023) - Volume 3: KMIS, pages 303-310
ISBN: 978-989-758-671-2; ISSN: 2184-3228

An important factor in the widespread dissemination of FAIR data is a low entry barrier for preparing and providing data according to the FAIR criteria. If scientists have to manually extract, transform and annotate the data according to the FAIR criteria and then export it to make it publicly available, this requires a significant investment of time that does not primarily reward the scientist who prepares and provides the data.

To increase the willingness to make data available to the scientific community, we are developing ZeitGeist, a tool that supports and largely automates the publication and annotation of time series data stored in an Influx database. The tool is being developed in the context of the Energy Lab 2.0 (ELAB, 2023).

The energy transition raises many questions: How can energy be generated in an environmentally friendly way and stored efficiently? What happens when the sun does not shine and the wind does not blow? And what happens if more electricity is suddenly needed? To answer these questions, the Energy Lab 2.0, Europe's largest research infrastructure for renewable energy, investigates the intelligent interaction of various options for generating, storing, and supplying energy, in particular the intelligent networking of environmentally friendly energy generators and storage methods. In addition, energy systems of the future are simulated and tested based on real consumer data. A plant network links electrical, thermal, and chemical energy flows as well as new information and communication technologies. The research aims at improving the transport, distribution, storage, and use of electricity and thus creates the basis for the energy transition.

The Energy Lab 2.0 operates a large InfluxDB time series database cluster, in which a wide variety of energy-related data has been stored in a large number of individual databases over periods of up to 15 years. These data in turn form the basis for a wide variety of research projects such as SEKO (Sector Coupling, https://www.esd.kit.edu/85.php), Living Lab Energy Campus (https://www.fz-juelich.de/de/llec), Kopernikus 2X (https://www.kopernikus-projekte.de/en/projects/p2x), and others. To make the experiments performed at KIT reproducible, these data must be made available. So far, this has mostly been done via git or DVC (DVC, 2023) repositories.

ZeitGeist is a web application consisting of a backend service and an interactive frontend. The backend provides arbitrary, predefined and annotated time series data of a measurement (an InfluxDB structure that corresponds to a table in a relational database) via a URL, without requiring any further access information. The data to be delivered is specified via HTTP GET parameters. These include the desired time interval, conditions on the attributes, and the name of a configuration file in which the access information for the Influx server is stored. The actual request is made by a series of REST API (Inf, 2021) calls to InfluxDB. To be able to extract arbitrarily large amounts of data, a stream-based approach was chosen. The data is returned as an RO-Crate dataset (Soiland-Reyes et al., 2022). The column data types are extracted via metadata calls to the Influx database (InfluxMeta, 2022). Further information about the attributes (such as quantity and unit), provided as metadata in the RO-Crate, can additionally be specified in the configuration file.

The frontend implements the interactive construction of the URL for reading out the time series data. The first step is to select the configuration file stored for a particular measurement, which contains, among other things, the information for accessing a specific database. This information is used to access the measurement and determine the time interval for which data is available. Meta information about the measurement is read out, including the attributes with their data types. In addition, for attributes which act as tags (descriptive attributes), the existing tag values are extracted. These attributes can be used to interactively formulate extraction conditions (e.g. only data of certain buildings, devices, ...). Finally, the time interval of the data to be extracted must be specified. The result of this step is a URL, conforming to the backend API, for exporting the data.

The rest of the paper is structured as follows: Section 2 provides an overview of the characteristic features of the InfluxDB database management system. Section 3 explains the four FAIR guiding principles (Findable, Accessible, Interoperable, and Reusable), which should apply to scientific data management and stewardship. Section 4 introduces RO-Crate, a lightweight approach to packaging research artifacts along with their metadata in machine-readable form in a container. Section 5 then introduces our tool ZeitGeist, its architecture and internal functionality as well as its configuration options. Section 6 concludes the paper with a summary and an outlook on future work.

KMIS 2023 - 15th International Conference on Knowledge Management and Information Systems


2 INFLUX DATABASE

The InfluxDB database management system is optimized for storing and querying time series data.

Figure 1 shows the structure of a record. Each record consists of a mandatory timestamp, zero or more tags describing the record (e.g. the location of a sensor), and at least one field for storing a value (e.g. a sensor value). Both the timestamp and the optional tags are indexed for quick access, but the fields are not. This means that for read requests, records can be quickly selected or grouped by their tag values, but not by the actual measured values. The timestamp is represented as an RFC 3339 UTC timestamp with nanosecond precision; all tags have the datatype string, while the fields can have one of the datatypes float, integer, boolean, or string. Each record is stored in a measurement, which is an organizational element of the database, similar to a table in a relational database. In contrast to a relational table, a measurement is not based on a schema, so that in principle each record can have its own fields and tags. For efficiency reasons, the tag values of a record are not stored directly with the record; instead, a hash value is determined for the combination of tag values, which is then stored with the record.
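The record structure described above can be modelled in a few lines of Python. This is only an illustrative sketch, not part of ZeitGeist or InfluxDB: the class name Record, the method tag_set and the field name value with its reading are assumptions, and Python's datetime only offers microsecond rather than nanosecond precision.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Tuple, Union

# Field values may be float, integer, boolean or string; tags are always strings.
FieldValue = Union[float, int, bool, str]


@dataclass(frozen=True)
class Record:
    """Illustrative model of one time series record (hypothetical class)."""
    measurement: str
    time: datetime                      # mandatory timestamp
    fields: Dict[str, FieldValue]       # at least one field, not indexed
    tags: Dict[str, str] = field(default_factory=dict)  # optional, indexed

    def __post_init__(self) -> None:
        if not self.fields:
            raise ValueError("a record needs at least one field")
        if not all(isinstance(v, str) for v in self.tags.values()):
            raise ValueError("tag values must be strings")

    def tag_set(self) -> Tuple[Tuple[str, str], ...]:
        # Canonical form of the tag combination; a database could hash this
        # instead of storing the tag values with every record.
        return tuple(sorted(self.tags.items()))


reading = Record(
    measurement="kit.cn.buildings.tapwater",
    time=datetime(2019, 9, 23, 2, 0, tzinfo=timezone.utc),
    fields={"value": 1.5},              # hypothetical sensor reading
    tags={"building": "0101"},
)
```

Because no schema is enforced, two Record instances of the same measurement may carry entirely different tags and fields.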

The defining characteristic of time series data, its chronological order, together with the lack of transactional support and the reduced query facilities compared to, e.g., SQL, enables the database to perform a series of internal optimizations that result in much higher write and read rates than would be possible with a general-purpose database.

Figure 1: InfluxDB record.

An interesting aspect of time series databases is the retention policy. This specifies how long the data should be stored in the database. Records whose age exceeds the retention period are automatically deleted by the server.

Continuous queries are closely linked to the retention policy. These are executed periodically by the database system and are used to "downsample" the data records. This means that older data records are stored in aggregated form before they reach the end of their lifetime, as defined by the retention policy.
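The effect of downsampling can be illustrated outside the database. The following sketch is plain Python and does not reproduce InfluxDB's continuous query mechanism; the function name downsample_mean and the choice of the mean as aggregate are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def downsample_mean(points, bucket=timedelta(hours=1)):
    """Aggregate (timestamp, value) pairs into the mean value per time bucket.

    Timestamps must be timezone-aware; buckets are aligned to the Unix epoch.
    """
    sums = defaultdict(lambda: [0.0, 0])        # bucket index -> [total, count]
    for ts, value in points:
        index = (ts - EPOCH) // bucket          # timedelta // timedelta gives an int
        sums[index][0] += value
        sums[index][1] += 1
    return [
        (EPOCH + index * bucket, total / count)
        for index, (total, count) in sorted(sums.items())
    ]
```

Applied to raw records that are about to exceed their retention period, such an aggregation keeps one value per hour instead of every individual reading.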

An InfluxDB server can run on a single machine as well as in a cluster. It supports both sharding and replication. An InfluxDB server hosts multiple databases. Each database can contain multiple measurements, in which the data records are stored.

InfluxDB comes with a REST API (Inf, 2021). This allows communication with the database from almost any programming language. In addition, there are a number of language bindings, all of which are based on the REST API. InfluxDB currently supports two query languages: InfluxQL, an SQL-like query language, and the newer, stream-based Flux query language. InfluxQL also supports queries against the data dictionary, so that information about the structure of the data can be read out. This feature is used by the Influx Exporter developed by us.
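As a rough illustration of such data dictionary queries, the sketch below builds InfluxQL schema exploration statements and the corresponding request URLs for the /query endpoint of the InfluxDB 1.x HTTP API. The helper names, the server address and the database name are assumptions; the statements actually issued by the Influx Exporter are not given in this paper.

```python
from urllib.parse import urlencode


def quote_ident(name):
    """Double-quote an InfluxQL identifier, escaping backslashes and quotes."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def schema_queries(measurement, tag_keys=()):
    """Statements that list fields, tags and, per tag, the existing tag values."""
    m = quote_ident(measurement)
    queries = [f"SHOW FIELD KEYS FROM {m}", f"SHOW TAG KEYS FROM {m}"]
    queries += [
        f"SHOW TAG VALUES FROM {m} WITH KEY = {quote_ident(key)}"
        for key in tag_keys
    ]
    return queries


def query_url(server, database, statement):
    """URL of a GET request against the /query endpoint (server is hypothetical)."""
    return f"{server}/query?" + urlencode({"db": database, "q": statement})


for statement in schema_queries("kit.cn.buildings.tapwater", ["building"]):
    print(query_url("https://influx.example.org:8086", "mydb", statement))
```

The field keys query also reports the data type of each field, which is the kind of information the exporter needs to annotate the output columns.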

3 FAIR PRINCIPLES

An important aspect of FAIR is machine processability, since huge amounts of data, their constant growth, and high data complexity make purely manual processing impossible (Go-fair, 2022).

The principles formulated below do not prescribe any technologies, standards or implementations, but serve as guidelines for possible implementations.

Findable: Data and metadata must be findable for both humans and computers. For this purpose, the data must be described by rich metadata. Furthermore, data and metadata must be identifiable by a globally unique and persistent identifier (PID). A metadata record should refer to the record of the described data by its PID. In order for data and metadata to be found, both must be registered and indexed in a searchable resource.

Accessible: Data and metadata must be accessible by their PID through a standardized communication protocol that supports authorization and authentication. Even if the data is no longer available, it should be possible to access at least the metadata.

Interoperable: Metadata are described by a formal, common, accessible and widely applicable language for knowledge representation. Examples of such languages include RDF, JSON-LD, and OWL. Furthermore, it must be possible to qualitatively describe relationships between the data sets, which makes it necessary to identify the data sets by their PID.

Reusable: Metadata should be described by a variety of precise and relevant attributes. This should help the client (human or computer) to decide whether the data is relevant or not. In addition, the data and metadata are provided with a clear and accessible data usage license and with provenance information. Furthermore, if there are domain-specific standards or best practices for archiving and sharing data, they should be followed.

4 RO-CRATE

The FAIR principles presented above are described independently of any implementation aspects and leave a wide scope for interpretation. This section specifically addresses how research objects (files, workflows, ...) can be described using metadata. Ideally, complete experiments can be repeated on the basis of the data and associated metadata, thus ensuring reproducibility and reusability. In our opinion, one of the most promising approaches is RO-Crate (Soiland-Reyes et al., 2022). It is a lightweight approach to packaging research artifacts together with their metadata in machine-readable form in a container. This can be done, for example, through a ZIP archive or a GitHub repository. The semantics of the metadata are described by schema.org vocabularies in JSON-LD (JSON-LD, 2018) syntax.

An RO-Crate container consists of the following artifacts:

Data entities are files that either exist locally in the container as a byte stream or are references to external files outside the container, or they are directories. The data entities are described in more detail by the contextual entities.

Contextual entities exist outside the container and are represented inside the container only by their metadata, e.g. a Person referenced by their ORCID.

The root directory of the container contains the RO-Crate metadata file (ro-crate-metadata.json), which describes the contents of the RO-Crate, the metadata and their relations to each other. The description uses the linked-data format JSON-LD.
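To make the relation between the metadata file, data entities and contextual entities more concrete, the following sketch assembles a small ro-crate-metadata.json structure following the RO-Crate 1.1 layout. It is not the metadata ZeitGeist emits: the function name build_metadata and all argument values (author name, ORCID, dataset name, date) are placeholders chosen for illustration.

```python
import json


def build_metadata(name, date_published, data_file, author_id, author_name, license_url):
    """Assemble a small RO-Crate 1.1 style metadata document as a Python dict."""
    graph = [
        {   # descriptor of the metadata file itself
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {   # root data entity: the crate as a whole
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "datePublished": date_published,
            "license": {"@id": license_url},
            "author": {"@id": author_id},
            "hasPart": [{"@id": data_file}],
        },
        {   # data entity: a file stored inside the container
            "@id": data_file,
            "@type": "File",
            "encodingFormat": "text/csv",
        },
        {   # contextual entity: exists outside the crate, described by metadata only
            "@id": author_id,
            "@type": "Person",
            "name": author_name,
        },
    ]
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


metadata = build_metadata(
    name="Example export",                                  # placeholder
    date_published="2023-01-01",                            # placeholder
    data_file="data.csv",
    author_id="https://orcid.org/0000-0000-0000-0000",      # placeholder ORCID
    author_name="Jane Doe",                                 # placeholder
    license_url="https://spdx.org/licenses/CC-BY-4.0.html",
)
print(json.dumps(metadata, indent=2))
```

Entities reference each other only through their @id values, which is what allows a client to traverse the flat @graph list.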

RO-Crate also allows the definition of so-called profiles, which simplify domain-specific use in the sense that certain assumptions can be made about the structure and content of the RO-Crates, thus facilitating programmatic use.

5 INFLUX EXPORTER

5.1 Architecture

ZeitGeist consists of a backend service and an interactive frontend. The backend provides arbitrary, predefined and annotated time series data of a measurement via a URL, without requiring any further access information. The web-based frontend implements the interactive construction of the URL for reading out the time series data.

Figure 2 gives a high-level overview of the components involved. The ExportConfigurator (1) allows the selection of a previously defined configuration file. The configuration specifies the information for accessing the Influx database (server, port, database, user, password) as well as the measurement to be read out. Further optional settings for default values are described in Section 5.2. An example configuration is given in Listings 3 and 4.

After a specific configuration has been selected, this information is used to perform a series of queries via the InfluxDB REST API (3). One of the calls determines the time interval within which data is available in the measurement. The meta information about the measurement is also read out. This includes the available attributes (tags and fields) with their data types. In addition, for tags (descriptive attributes), the existing values are extracted. These attributes can be used to interactively formulate extraction conditions (e.g. only data of certain buildings, devices, ...). Additionally, the time interval of the data to be extracted must be specified. The result of this step is a URL (4), conforming to the backend API, for exporting the data. The URL contains a number of HTTP GET parameters that specify the desired data as well as the configuration file. An example of a generated URL is shown in Listing 1. Besides the configuration file kit.cn.buildings.tapwater.ini, it specifies the start and end of the time interval (2019-09-23T02:00:00Z and 2019-10-23T02:00:00Z) as well as a restriction on the tag building (0101 or 0121).

https://zeitgeist.clients.iai.kit.edu/InfluxExporter.php?config=kit.cn.buildings.tapwater.ini&start=2019-09-23T02:00:00Z&end=2019-10-23T02:00:00Z&select_building[]=0101&select_building[]=0121

Listing 1: Generated URL.
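The construction of such a URL can be sketched in Python as follows. The parameter names config, start, end and select_<tag>[] are taken from Listing 1; the function name build_export_url and the dictionary-based interface are assumptions and do not describe the actual frontend code.

```python
from urllib.parse import urlencode

BASE_URL = "https://zeitgeist.clients.iai.kit.edu/InfluxExporter.php"


def build_export_url(config, start, end, tag_filters):
    """Build an export URL in the style of Listing 1.

    tag_filters maps a tag name to the list of accepted tag values.
    """
    params = [("config", config), ("start", start), ("end", end)]
    for tag, values in tag_filters.items():
        # One select_<tag>[] parameter per accepted value.
        params += [(f"select_{tag}[]", value) for value in values]
    # Keep ':' and '[]' unescaped so the result is readable, as in Listing 1.
    return BASE_URL + "?" + urlencode(params, safe=":[]")


url = build_export_url(
    config="kit.cn.buildings.tapwater.ini",
    start="2019-09-23T02:00:00Z",
    end="2019-10-23T02:00:00Z",
    tag_filters={"building": ["0101", "0121"]},
)
print(url)
```

Several values for the same tag are passed as repeated parameters, which the backend can interpret as alternatives.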

The script behind the URL is InfluxExporter (5). It is responsible for delivering the specified data as an RO-Crate object. The program expects a number of key-value pairs, which are delivered as HTTP GET parameters. The script reads the configuration file specified in the URL to obtain the information for accessing the Influx database and then transforms the given parameters into an InfluxQL query. The query for the URL in Listing 1 can be seen in Listing 2.

Figure 2: General Architecture.

select *
from "kit.cn.buildings.tapwater"
where time >= '2019-09-23T02:00:00Z'
  and time <= '2019-10-23T02:00:00Z'
  and ("building" = '0101' or
       "building" = '0121')
order by time

Listing 2: Generated InfluxQL query.
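The translation from GET parameters to an InfluxQL query can be sketched as follows. This is an assumed reconstruction in Python, not the PHP code of InfluxExporter: the function names are hypothetical, the measurement name is passed in explicitly (in ZeitGeist it comes from the configuration file), and the escaping rules shown are a simplification.

```python
from urllib.parse import parse_qs, urlsplit


def quote_ident(name):
    """Double-quote an InfluxQL identifier."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def quote_str(value):
    """Single-quote an InfluxQL string literal."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def to_influxql(url, measurement):
    """Translate the GET parameters of an export URL into one InfluxQL query."""
    params = parse_qs(urlsplit(url).query)
    clauses = [
        f"time >= {quote_str(params['start'][0])}",
        f"time <= {quote_str(params['end'][0])}",
    ]
    for key, values in params.items():
        if key.startswith("select_") and key.endswith("[]"):
            tag = quote_ident(key[len("select_"):-2])
            # Several values for one tag are alternatives, hence "or".
            alternatives = " or ".join(f"{tag} = {quote_str(v)}" for v in values)
            clauses.append(f"({alternatives})")
    return (
        f"select * from {quote_ident(measurement)} where "
        + " and ".join(clauses)
        + " order by time"
    )


listing_1 = (
    "https://zeitgeist.clients.iai.kit.edu/InfluxExporter.php"
    "?config=kit.cn.buildings.tapwater.ini"
    "&start=2019-09-23T02:00:00Z&end=2019-10-23T02:00:00Z"
    "&select_building[]=0101&select_building[]=0121"
)
print(to_influxql(listing_1, "kit.cn.buildings.tapwater"))
```

Quoting every value before it is inserted into the statement matters here, because the parameters arrive from a publicly reachable URL.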

5.2 Configuration

A configuration is normally split into two files. The reason is that a configuration is specific to one measurement, whereas the connection information for accessing the Influx cluster is the same for many measurements. To avoid repeating the complete login information in each configuration file, it is moved to a separate file and referenced from the measurement-specific configuration file. This has the particular advantage that if the access information changes (e.g. the database is moved to a new computer), the changes only have to be made in one file. Listings 3 and 4 illustrate this. The access information is located in Listing 3 (file: ini-files/elab-ml4t.ini) and is included in line 3 of the measurement-specific configuration file (Listing 4).

In addition to the settings already described in the previous section, there are a number of further optional settings. These are either default values that are displayed in the configurator (export time interval) or information that is transferred to the RO-Crate container as meta information.

Unless otherwise specified, the entire time interval during which data is available is displayed in the GUI as the default interval for export. By setting one or more of the properties default_interval_start, default_interval_end and default_interval_range, the default interval to be exported can be customized. Besides absolute timestamps in UTC format, relative timestamps using now(), a function supported by InfluxDB, can also be used. For example, the property setting default_interval_start = "now() - 1h" returns all records within the last hour. If only absolute timestamps are used, the property cacheable can be set to true. In this case, the returned data is also stored in a cache, so that subsequent queries with the same conditions and time interval do not have to be sent to the database again but can be served directly from the cache, thus relieving the database. However, this only makes sense if it can be ensured that the data has not changed in the meantime (which is typically the case with historical data).
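The caching behaviour described above can be sketched with a simple directory-based cache. The code is an assumed illustration in Python, not the prototype's implementation: the function names, the SHA-256 based file naming and the run_query callback are hypothetical.

```python
import hashlib
from pathlib import Path


def cache_key(params):
    """Stable key for a list of (name, value) request parameters."""
    canonical = "&".join(f"{name}={value}" for name, value in sorted(params))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_cacheable(params):
    """Relative timestamps such as 'now() - 1h' change meaning over time."""
    return not any("now()" in value for _, value in params)


def fetch(params, cache_dir, run_query):
    """Return the export for params, querying the database only on a cache miss.

    run_query is a callback that performs the actual export and returns bytes.
    """
    if not is_cacheable(params):
        return run_query(params)
    path = Path(cache_dir) / (cache_key(params) + ".zip")
    if path.exists():
        return path.read_bytes()        # served from the cache
    data = run_query(params)
    path.write_bytes(data)              # remember the result for later requests
    return data
```

Sorting the parameters before hashing means that two URLs that differ only in parameter order map to the same cache entry.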

Further properties allow a more precise specification of the meta information provided for the output columns. While the data types (string, float, integer, boolean, timestamp) of the individual tags and fields are determined automatically by special queries regarding the schema of an Influx database, further information such as the "quantity" or "unit" of a result column must be defined by setting the corresponding properties. In Listing 4, for example, lines 22 and 23 specify that the quantity kind of the field value is ActiveEnergy and that the unit is kilowatt-hours (KiloW-HR in QUDT notation).

To provide information about the publisher (person, organisation), the two properties ror and orcid can be set in the configuration file. These entries are displayed as default values in the Export Configurator's GUI but can be overwritten. The same applies to the property license.

server = "https://elab-influx-d1.server.elab2.kit.edu:8086"
username = "iai-ml4t-flx-r-001"
password = "${IAI_ML4T_FLX_R_001}"

Listing 3: File ini-files/elab-ml4t.ini.

1 ; mandatory entries:
2 ;
3 connection = ini-files/elab-ml4t.ini
4 database = fm_efficio_mirror
5 measurement = efficio_raw

7 ; optional entries:
8 ;
9 default_interval_end="now()"
10 default_interval_range="1 hour"
11 description="efficio_raw (DB: fm_efficio_mirror)"
12 cacheable=false

14 [ro-crate]
15 ror="https://ror.org/04t3en479"
16 orcid="https://orcid.org/0000-0002-9911-5881"
17 licence="https://spdx.org/licenses/CC-BY-4.0.html"

19 ; optional datatype information
20 ;
21 [columns]
22 quantity[value] = "http://qudt.org/vocab/quantitykind/ActiveEnergy"
23 unit[value] = "http://qudt.org/vocab/unit/KiloW-HR"

Listing 4: Configuration file.
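How the two files might be combined is sketched below in Python. Several points are assumptions rather than documented behaviour: that the ${...} placeholder in Listing 3 refers to an environment variable, that entries before the first section header can be treated as an implicit section (here called main), and all function names. The prototype itself is written in PHP and may resolve these details differently.

```python
import configparser
import os


def parse_ini(text):
    """Parse INI text whose first entries precede any section header."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string("[main]\n" + text)      # 'main' is a made-up section name
    return {
        section: {key: value.strip().strip('"') for key, value in parser[section].items()}
        for section in parser.sections()
    }


def load_config(measurement_ini, read_file):
    """Merge a measurement-specific configuration with its connection file.

    read_file is a callback that returns the text of a referenced file.
    """
    config = parse_ini(measurement_ini)
    connection = parse_ini(read_file(config["main"]["connection"]))["main"]
    # Assumption: ${NAME} placeholders are filled from environment variables.
    connection["password"] = os.path.expandvars(connection["password"])
    config["connection"] = connection
    return config


CONNECTION_INI = '''
server = "https://elab-influx-d1.server.elab2.kit.edu:8086"
username = "iai-ml4t-flx-r-001"
password = "${IAI_ML4T_FLX_R_001}"
'''

MEASUREMENT_INI = '''
; mandatory entries:
connection = ini-files/elab-ml4t.ini
database = fm_efficio_mirror
measurement = efficio_raw

[columns]
unit[value] = "http://qudt.org/vocab/unit/KiloW-HR"
'''

os.environ.setdefault("IAI_ML4T_FLX_R_001", "not-a-real-password")
config = load_config(MEASUREMENT_INI, lambda path: CONNECTION_INI)
print(config["main"]["measurement"], config["connection"]["server"])
```

Keeping the secret out of both files and resolving it only at load time means the configuration directory can be shared without exposing credentials.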

5.3 Output Format

The export of a measurement is provided as an RO-Crate container. Figure 3 shows the content of the zipped RO-Crate. The name of the ZIP file is formed from the measurement name and the exported time interval. The ZIP file contains three files. The file data.csv contains the exported time series dataset. The file ro-crate-metadata.json contains the metadata in JSON-LD format. Additionally, the file ro-crate-preview.html is created and included. This file is not mandatory and contains a human-readable HTML representation of the content of the ro-crate-metadata.json file.

Figure 3: Result RO-Crate object.

Figure 4: Extract from ro-crate-metadata.json.

The CSVW (CSVW, 2017) syntax specification is used to describe the structure of the CSV file. In addition to general aspects such as separators and the use of quotation marks, CSVW can be used to specify the data types of the columns. Figure 4 is an excerpt from the ro-crate-metadata.json file. It shows the connection between the file data.csv, the schema kit.cn.buildings.tapwater-... and the associated columns together with their data types, described in CSVW.
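A CSVW description of such a file can be sketched as follows. The example produces a standalone CSVW table description and does not reproduce how ZeitGeist embeds the schema in ro-crate-metadata.json (Figure 4); the mapping from Influx data types to CSVW datatypes, the function name and the column list are assumptions.

```python
# Assumed mapping from the data types named in Section 5.2 to CSVW datatypes.
CSVW_TYPES = {
    "timestamp": "dateTime",
    "float": "double",
    "integer": "integer",
    "boolean": "boolean",
    "string": "string",
}


def csvw_table(csv_name, columns):
    """Describe a CSV file; columns is a list of (name, influx_type) pairs."""
    return {
        "@context": "http://www.w3.org/ns/csvw",
        "url": csv_name,
        "dialect": {"delimiter": ",", "quoteChar": '"', "header": True},
        "tableSchema": {
            "columns": [
                {"name": name, "datatype": CSVW_TYPES[influx_type]}
                for name, influx_type in columns
            ]
        },
    }


table = csvw_table(
    "data.csv",
    [("time", "timestamp"), ("building", "string"), ("value", "float")],
)
```

With such a description, a consumer can parse data.csv into typed columns without guessing from the values.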

5.4 Implementation Aspects

The current prototype was developed as a proof of concept in PHP and runs on an Apache server. The cache is implemented using a simple filesystem directory. The streaming component is realized using the ZipStream-PHP library (Männchen, 2023) by Jonatan Männchen. A complete reimplementation in Python is planned, which will be made available to the community as open source in the hope of contributing to the wider adoption of the FAIR principles. The future features discussed in Section 6 will also be implemented in this next version.
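For the planned Python version, writing the crate row by row could look roughly like the sketch below, which uses only the standard library zipfile module. It is an assumption about one possible design, not the announced implementation: the function name write_crate is hypothetical, the CSV rows are joined naively (values are assumed to contain no commas or quotes), and the preview HTML file is omitted.

```python
import io
import json
import zipfile


def write_crate(out, rows, metadata):
    """Write data.csv and ro-crate-metadata.json into a ZIP archive.

    out is a binary file-like object; rows is an iterable of lists of strings,
    so the CSV content never has to be held in memory as a whole.
    """
    with zipfile.ZipFile(out, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        with archive.open("data.csv", "w") as member:
            for row in rows:
                # Assumes values need no CSV quoting.
                member.write((",".join(row) + "\r\n").encode("utf-8"))
        archive.writestr("ro-crate-metadata.json", json.dumps(metadata, indent=2))


buffer = io.BytesIO()
write_crate(
    buffer,
    rows=iter([["time", "value"], ["2019-09-23T02:00:00Z", "1.5"]]),
    metadata={"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": []},
)
```

Because rows is consumed lazily, it can be fed directly from chunked database responses.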

6 CONCLUSION AND OUTLOOK

The FAIR principles seek to ensure sustainable research data management. By enriching data with metadata, it should be possible for both humans and computers to (1) find relevant data (Findable), (2) access it (Accessible), (3) integrate it with other data (Interoperable), and (4) decide, based on the given metadata, whether it can be used in different contexts (Reusable).

To make it possible in the future to publish research data according to the FAIR principles easily and without significant time effort, we are developing ZeitGeist, which can greatly simplify the publication process of Influx time series data. ZeitGeist allows the configuration of the data to be exported and automatically adds meta information during the subsequent export, according to the FAIR principles and following the RO-Crate approach.

A reimplementation of the prototype in Python and its release as open source are planned. For this version, we plan the following enhancements:

Currently, configuration files are created directly in a directory of the web server, which requires access to that directory; this is typically reserved for administrators. In the new version, it should be possible to create and administer configuration files using the GUI of the Export Configurator. By also integrating a user/group management concept, the configuration files created in this way can then be restricted to specific persons or groups.

In the next version, we also plan to make the set of exported attributes selectable, instead of always exporting all attributes of the measurement.

Automatic resolution of ORCID and ROR: The RO-Crate specification requires that referenced contextual entities (metadata) should at least be described with a name and type in the RO-Crate metadata file. The reason for this is that clients then do not need to follow all links when displaying the provided information. The next version should therefore resolve the configured ORCID and ROR identifiers automatically to obtain this information.

Another possible extension would be to allow the

export of different temporal resolutions of the data.

ACKNOWLEDGEMENTS

This publication was supported within the Hub Energy of the Helmholtz Metadata Collaboration (HMC), an incubator platform of the Helmholtz Association within the framework of the Information and Data Science strategic initiative.

REFERENCES

Baker, M. (2016). Is there a reproducibility crisis? Nature, pages 452–454.

Buttigieg, P. L., Curdt, C., Ihsan, A. Z., Jejkal, T., Kubin, M., Mannix, O., Mohr, D. P., Pirogov, A., Port, B., and Stucky, K.-U. (2022). An interpretation of the FAIR principles to guide implementations in the HMC digital ecosystem. Project report, Geomar, Kiel, Germany.

CSVW (2017). CSVW Namespace Vocabulary Terms. https://www.w3.org/ns/csvw.

DVC (2023). DVC Documentation. https://dvc.org/doc.

ELAB (2023). Welcome to the Energy Lab 2.0. https://www.elab2.kit.edu/english/index.php.

Go-fair (2022). FAIR Principles. https://www.go-fair.org/fair-principles/.

HMC (2023). Helmholtz Metadata Collaboration (HMC). https://helmholtz-metadaten.de/.

Inf (2021). InfluxDB API reference. https://docs.influxdata.com/influxdb/v1.8/tools/api/.

InfluxMeta (2022). Explore your schema using InfluxQL. https://docs.influxdata.com/influxdb/v1.8/query_language/explore-schema/.

JSON-LD (2018). JSON-LD 1.1: A JSON-based Serialization for Linked Data. https://www.w3.org/2018/jsonld-cg-reports/json-ld/.

Männchen, J. (2023). ZipStream-PHP. https://github.com/maennchen/ZipStream-PHP.

NFDI (2023). German National Research Data Infrastructure (NFDI). https://nfdi.de/.

NFDI4Energy (2023). NFDI4energy Consortia. https://nfdi4energy.uol.de/.

Soiland-Reyes, S., Sefton, P., Crosas, M., Castro, L. J., Coppens, F., Fernández, J. M., Garijo, D., Grüning, B., La Rosa, M., Leo, S., Ó Carragáin, E., Portier, M., Trisovic, A., Community, R.-C., Groth, P., Goble, C., and Peroni, S. (2022). Packaging research artefacts with RO-Crate. Data Science, 5(2):97–138.

Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L. B., Bourne, P. E., et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3.
