Research Question: What does a DQ tool for the
user-friendly and collaborative specification of DQ
rules look like?
To answer the proposed research question, we developed and prototypically implemented ColDaQ, a DQ tool for the graphical and collaborative specification of DQ rules. ColDaQ offers a data-flow programming environment for constructing DQ rules in a user-friendly way. The low-code approach is well-suited for supporting users from various backgrounds in realizing their DQ requirements (Altendeitering and Schimmler, 2022). In this sense, our tool helps data domain experts become self-reliant and empowers them to integrate and operate DQ rules themselves. As a result, we contribute to overcoming the typical divide of DQ tasks between technical (e.g., data engineers) and non-technical (e.g., domain experts) users that complicates DQ management (Altendeitering and Guggenberger, 2021).
For the prototypical implementation of ColDaQ, we used Great Expectations (The Great Expectations Team, 2020) and CINCO (Naujokat et al., 2018) as technological bases. The qualitative evaluation of our application was two-fold. In the first round, we interviewed three experts from the DQ and user experience domains to gain in-depth feedback. Subsequently, we completed a second round of evaluation using a focus group discussion with ten participants working in data-intensive jobs. Overall, we found that our tool can ease the manual specification of DQ rules and inform the design of future DQ tools.
The remainder of this article is structured as follows. First, we describe the theoretical background of our study regarding data management and quality, DQ tools, and domain-specific languages in section 2. In section 3, we outline the conceptual approach of our tool and describe its prototypical implementation. We present and discuss the qualitative evaluation results for our tool in section 4. Finally, in section 5, we describe the contributions of our study, highlight limitations, and outline paths for future work.
2 BACKGROUND
2.1 Data Management & Data Quality
“You can’t do anything important in your company without high-quality data” (Redman, 2020, p. 1). Redman (2020) states that DQ forms an important organizational success factor and is widely recognized as an essential building block for organizational agility and a driver of business innovation. For instance, a high level of DQ has a positive influence on the functioning of data-intensive applications (e.g., artificial intelligence) (Tebernum et al., 2021; Gröger, 2021), allows seamless business processes (Amadori et al., 2020), and builds trust among partners in a data ecosystem (Guggenberger et al., 2020).
To be considered high-quality, data needs to exhibit ‘fitness for use’, which is context-dependent and defined by the data consumer (Wang and Strong, 1996). Because DQ is a context-dependent and multi-dimensional concept, a sufficient level of DQ can be difficult to achieve. As a result, DQ has long been incorporated into data management practices but is still considered a significant issue (Wang, 1998; Gröger, 2021). Several frameworks, such as the Total Data Quality Management (TDQM) framework, emerged to support and ease DQ management. The TDQM framework outlines how to define, measure, analyze, and improve DQ, for example, by specifying DQ rules and validating data against these rules (Altendeitering, 2021). However, manual DQ management is complex, cumbersome, and error-prone and is therefore often supported by DQ tools (Altendeitering and Tomczyk, 2022; Altendeitering and Guggenberger, 2021).
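To make the notion of specifying DQ rules and validating data against them more concrete, the following minimal sketch may help. It is written in plain Python and is not taken from ColDaQ, Great Expectations, or the TDQM framework; the names `Rule`, `validate`, and the sample columns `customer_id` and `age` are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Rule:
    """A hypothetical DQ rule: a named predicate over a single column."""
    name: str
    column: str
    check: Callable[[Any], bool]


def validate(rows: list[dict[str, Any]], rules: list[Rule]) -> dict[str, list[int]]:
    """Return, for each rule name, the indices of the rows that violate it."""
    violations: dict[str, list[int]] = {rule.name: [] for rule in rules}
    for index, row in enumerate(rows):
        for rule in rules:
            # A missing column is passed to the check as None.
            if not rule.check(row.get(rule.column)):
                violations[rule.name].append(index)
    return violations


# Illustrative records and rules (invented for this sketch).
ROWS = [
    {"customer_id": 1, "age": 34},
    {"customer_id": None, "age": 27},
    {"customer_id": 3, "age": -5},
]
RULES = [
    Rule("customer_id_not_null", "customer_id", lambda v: v is not None),
    Rule("age_in_range", "age", lambda v: isinstance(v, int) and 0 <= v <= 120),
]
RESULT = validate(ROWS, RULES)
```

In this sketch, a rule is only a declarative description (name, column, predicate), while validation is a separate step that reports the violating rows. Keeping the two apart is one possible way to let domain experts state requirements without touching the validation logic.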
2.2 Data Quality Tools
A plurality of tools emerged from science and practice to support DQ management. We can divide these tools into data preparation tools for correcting erroneous data, data measuring and monitoring tools for data validation, and general-purpose tools, which offer the most comprehensive set of DQ functionalities (Ehrlinger and Wöß, 2022; Altendeitering and Tomczyk, 2022). Over time, DQ evolved from an algorithmic, IT-centric task to a joint effort involving multiple stakeholders from across the organization (Altendeitering and Tomczyk, 2022). These developments raised new requirements for modern DQ tools, which must offer collaborative functionalities and accessible user interfaces (Altendeitering and Guggenberger, 2021; Swami et al., 2020). However, established tools often cannot fulfill these requirements and lack support for inexperienced users and collaborative approaches to DQ (Altendeitering and Tomczyk, 2022).
In this study, we aim to address this shortcoming of established tools and develop a collaborative, user-friendly solution for defining DQ rules and validating data. For this purpose, we extend the DQ tool Great Expectations (GE) (The Great Expectations Team, 2020). We decided to use GE as it is open-source, offers extensive documentation, and is well-established in the field of DQ. It also has an active community and was