An Empirical Examination of the Technical Aspects of Data Sovereignty

Julia Pampus (1, a) and Maritta Heisel (2, b)

(1) Institute for Software and Systems Engineering ISST, Dortmund, Germany

(2) Paluno - The Ruhr Institute for Software Technology, University of Duisburg-Essen, Duisburg, Germany

Keywords:

Data Sharing, Data Sovereignty, Requirements Engineering, Empirical Study, Goal Modeling.

Abstract:

Self-determination and autonomy in data sharing, in recent research also referred to as data sovereignty,

are attracting increasing interest in the context of industrial ecosystems. Its practical implementation involves

organisational, regulatory, legal, and particularly technical aspects. Previous work has not yet focused on the

structured analysis of technical characteristics of systems used in data sharing concerning data sovereignty. In

this paper, we therefore elicit which system requirements support the data sovereignty of a data sharing participant, starting from privacy protection goals, FAIR principles, and ISO/IEC 25010:2011. To address this, we

conducted a qualitative study in the form of an online questionnaire. We asked 18 domain experts to evaluate

selected system criteria for their impact and relevance to the implementation of data sovereignty. Our work

has resulted in a set of 22 functional requirements that can be used for designing data sharing systems. Subse-

quently, we discuss our ﬁndings, compare them with related work, and address further research.

1 INTRODUCTION

Nowadays, digital innovation and transformation ben-

eﬁt from data-driven value chains that include data

usage across company boundaries (Brauner et al.,

2022). A fundamental concept in this context is

the principle of data sovereignty. Data sovereignty

“refers to the self-determination [and autonomy] of

individuals and organisations with regard to the use

of their data” (Jarke et al., 2019, p.550). Recent work

focuses on general aspects of data sovereignty and

its conceptualisation within the scope of information

systems. It shows that the data infrastructure is crit-

ical for creating a trustworthy environment for data

sharing (von Scherenberg et al., 2024). Here, in ad-

dition to organisational, regulatory and legal require-

ments, especially technical requirements need to be

met by the data sharing participants.

Yet, a research gap remains in the detailed examination of these technical requirements (Hellmeier et al., 2023). To address this, we

pose the following research question (RQ): What re-

quirements does a system have to fulﬁl to ensure the

data sovereignty of a data sharing participant?

In this context, a system refers to a software application or a group of applications potentially deployed on multiple infrastructures. To answer the RQ, we conduct a qualitative empirical study and analyse the impact of common system characteristics on the assurance of data sovereignty. Analysing the study results, we consider data sovereignty as a non-functional requirement (NFR). The identified functional requirements (FRs) are presented as goal models using the i* modelling notation.

(a) https://orcid.org/0000-0003-2309-6183

(b) https://orcid.org/0000-0002-3275-2819

The remainder of this paper is structured as fol-

lows: Section 2 introduces the foundations of our

work. Section 3 presents our research method, includ-

ing preparatory tasks, study design, and data collec-

tion and analysis. We then outline our study results in

Section 4, compare these to related work in Section 5,

and discuss them in Section 6. Finally, Section 7 sum-

marises our ﬁndings and the relevance of our work.

2 FUNDAMENTALS

This section introduces the fundamentals for under-

standing the following work.

2.1 Privacy Protection Goals

The protection goals for privacy engineering describe generally applicable criteria for “the legal, technical, economic, and societal dimensions of privacy and data protection in complex IT systems” (Hansen et al., 2015, p.159). They are composed of three security protection goals (confidentiality, integrity, availability), known as the CIA triad, and three further aspects that specifically address the issues of privacy and data protection (unlinkability, transparency, intervenability). Unlinkability describes the property that privacy-relevant data cannot be linked to other privacy-relevant data beyond the context of use, e.g., to draw conclusions about persons. Transparency means understanding all data movements, e.g., during processing, so that they can be reconstructed at any time. Intervenability is defined as the ability to observe and actively interrupt or modify data processing (Hansen et al., 2015).

Pampus, J. and Heisel, M.
An Empirical Examination of the Technical Aspects of Data Sovereignty.
DOI: 10.5220/0012760600003753
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 19th International Conference on Software Technologies (ICSOFT 2024), pages 112-122
ISBN: 978-989-758-706-1; ISSN: 2184-2833

2.2 FAIR Principles

The FAIR principles deﬁne guidelines for the man-

agement of data and associated metadata for re-use by

third parties. First, (meta-) data should be ﬁndable for

users and systems. Next, the discovered data must be

accessible, including appropriate authentication and

authorisation. To use data for analyses, it must be in-

teroperable with other data or interfaces of systems.

Last, (meta-) data should be properly prepared so that

it can be reusable (Wilkinson et al., 2016).

2.3 Software Quality Characteristics

The ISO/IEC 25010:2011 (International Organization

for Standardization, 2011) provides software qual-

ity characteristics and describes an evaluation process

model. The model speciﬁes eight quality properties

with some sub-characteristics each: Functional suit-

ability means that a system works as expected and re-

quired in a speciﬁed context. The system could do

that with a certain performance efﬁciency and relia-

bility, and being compatible with other systems in the

same hardware or software environment. Next, usability describes how effectively, efficiently, and satisfactorily a user can interact with the system. The system

can be secure concerning conﬁdentiality, integrity,

non-repudiation, accountability, and authenticity. In

addition, its degree of maintainability, i.e., the abil-

ity to maintain functionality and the portability to an-

other hardware, software, or operational environment,

can be determined.

2.4 i* Modelling Notation

The i* modelling notation is a goal- and actor-

oriented framework for modelling requirements. It

focuses on actors, their intentions, and the strategies

to achieve goals (Dalpiaz et al., 2016). The language

consists of several entity types: actors, actor associa-

tions, intentional elements, intentional element links,

and social dependencies. Figure 1 visualises the ele-

ments used in our work: Each goal model has at least

one actor and an associated actor boundary, shown as

a grey circle in the background. There are different

actor types; we use Roles in the following. A Role

is an actor with an abstract characterisation within a

domain, in our case, data sharing.

Figure 1: Overview of i* Elements Used in This Work.

A goal model following the i* notation consists

of intentional elements. We use Goals, Qualities, and

Tasks. A Goal describes a state that an actor wants

to achieve. A Quality represents the desire of an ac-

tor. In many applications of the notation, this is an

NFR. Tasks describe actions that an actor performs to

achieve a Goal. Links connect intentional elements.

In our work, we only use links between Goals and

Qualities, referred to as Contribution, and links be-

tween Tasks and Goals, referred to as Reﬁnement.

There are four types of Contributions: make, help,

hurt, and break. Make means that an element alone can ensure the fulfilment of a Quality; break means that it alone can prevent the fulfilment. Help and hurt denote general positive and negative influences, respectively. If a Refinement has a

Goal as its parent, there can be an AND or OR re-

lationship. The arrows used in Figure 1 imply OR

relationships and allow the fulﬁlment of a parent with

“the fulﬁlment of at least one child” (Dalpiaz et al.,

2016, p.10) that is a ‘means’.
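The elements described above can be captured in a small data structure. The following Python sketch is only an illustration: the class names, the `satisfied` helper, and the example tasks are hypothetical and are neither part of the i* specification nor taken from the study. It models an OR-Refinement, in which a Goal counts as fulfilled once at least one child Task is fulfilled, and the four Contribution types.

```python
from dataclasses import dataclass, field
from enum import Enum


class Contribution(Enum):
    """The four i* Contribution types from a Goal to a Quality."""
    MAKE = "make"
    HELP = "help"
    HURT = "hurt"
    BREAK = "break"


@dataclass
class Task:
    name: str
    done: bool = False


@dataclass
class Goal:
    name: str
    # Child Tasks linked by OR-Refinement: one fulfilled child suffices.
    means: list = field(default_factory=list)

    def satisfied(self) -> bool:
        return any(task.done for task in self.means)


@dataclass
class Quality:
    name: str
    # (Goal, Contribution) pairs that point at this Quality.
    contributions: list = field(default_factory=list)


# Hypothetical example: a Goal that helps a Quality.
interrupt = Task("interrupt data processing", done=True)
negotiate = Task("negotiate data usage conditions")
intervenability = Goal("intervenability", means=[interrupt, negotiate])
sovereignty = Quality("data sovereignty",
                      contributions=[(intervenability, Contribution.HELP)])

# Names of all fulfilled Goals that help the Quality.
helping = [g.name for g, c in sovereignty.contributions
           if c is Contribution.HELP and g.satisfied()]
print(helping)
```

An AND-Refinement could be sketched the same way by replacing `any` with `all` in `satisfied`.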

3 RESEARCH METHOD

To identify technical aspects of data sovereignty, we

conducted an empirical study. Figure 2 visualises its

research design. Our work involved four steps: First,

we designed the questionnaire and selected participants based on selection criteria. Second, we conducted the study. Third, we analysed the responses. Last,

the results of this analysis form a selection of require-

ments for the implementation of data sovereignty.


[Figure 2 (diagram not reproduced). Labels recoverable from the figure: SLR, Selection of Articles, Selection of Terms, Keyword Extraction, Term Frequency, System Characteristics, Study Design, Participant Selection, Survey Design, Data Collection, Conducting the Survey, Data Analysis, Analysing the Responses, Results, Software Requirements to Achieve Data Sovereignty.]

Figure 2: Research Design for the Survey.

3.1 Preparatory Work

The pre-selection of system characteristics used for

our questionnaire required some preparatory work.

Therefore, as shown in Figure 2, the creation of the

questionnaire comprised the following steps: First,

we conducted a Systematic Literature Review (SLR)

to ﬁnd relevant articles. Next, we selected terms for

an analysis of these articles. Last, we used the most

frequent terms, i.e., system characteristics, as input

for the questionnaire design.

3.1.1 Systematic Literature Review

We built on the work of Hellmeier and von Scheren-

berg (2023) as a basis for our literature review since

they already provide an up-to-date literature review

on data sovereignty. In their work, the authors fo-

cussed on distinguishing data sovereignty from dig-

ital and technical sovereignty. For this, they con-

ducted an SLR and identiﬁed publications that “give

a concrete deﬁnition, discussion, implementation, or

explanation” (Hellmeier and von Scherenberg, 2023,

p.5) of the examined terms. In sum, 142 articles form

their ﬁnal result set, of which 51 deal with data sov-

ereignty. Figure 3 visualises the described process

(shaded grey). As we wanted to analyse the selected

articles automatically afterwards, we ﬁltered the 51

articles according to whether they deal with data sov-

ereignty in-depth and technically (inclusion criteria)

or only provide a brief insight or deﬁnition (exclusion

criteria). That resulted in a set of 29 articles of the

original 51. To add more recent ones to the dataset,

covering April 2022 to October 2023, we conducted a

Multivocal Literature Review, a form of an SLR that

includes grey literature (Garousi et al., 2019). Fol-

lowing Hellmeier and von Scherenberg (2023), we

considered scientiﬁc and industrial articles to exam-

ine the chosen subject from a practical point of view.

We used the following abstracted search string:

(Title: X OR Keywords: X OR Abstract: X),

X = “data sovereignty”
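The abstracted search string can be read as a predicate over bibliographic records. The sketch below shows one possible reading in Python; the record layout (`title`, `keywords`, `abstract`) and the case-insensitive substring match are assumptions, since each database has its own query syntax and the concrete queries are not given in the text.

```python
def matches_search(record: dict, phrase: str = "data sovereignty") -> bool:
    """Return True if the phrase occurs in the title, keywords, or abstract."""
    needle = phrase.lower()
    title = record.get("title", "").lower()
    abstract = record.get("abstract", "").lower()
    keywords = " ; ".join(record.get("keywords", [])).lower()
    return needle in title or needle in keywords or needle in abstract


# Hypothetical records; the field names are assumptions, not a database schema.
records = [
    {"title": "Data Sovereignty in Industrial Ecosystems",
     "keywords": [], "abstract": ""},
    {"title": "A Survey", "keywords": ["Data sovereignty"], "abstract": ""},
    {"title": "Cloud Cost Models", "keywords": ["pricing"],
     "abstract": "No match here."},
]
hits = [r["title"] for r in records if matches_search(r)]
print(hits)
```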

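[Figure 3 (diagram not reproduced). Search hits per database:

Database | SLR (Hellmeier & von Scherenberg, 2023) | Additional SLR
IEEE | 53 | 25
AISeL | 4 | 7
ProQuest | 49 | 32
ACM | 14 | 13
ScienceDirect | 22 | 11
∑ | 142 | 84 (88)

SLR (Hellmeier & von Scherenberg, 2023): 142 articles, of which 51 on data sovereignty and 29 after filtering and selection. Additional SLR: 84 articles, of which 21 after filtering and selection. Combined: 50 relevant articles.]

Figure 3: SLR Process for ‘Data Sovereignty’.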

Figure 3 depicts the overall process. We searched

a total of ﬁve databases, listed there. The additional

search resulted in another 84 articles (without du-

plicates), of which 21 remained after ﬁltering using

the previously mentioned exclusion/inclusion criteria.

Overall, the result set of the combined SLR comprised

50 relevant articles for the subsequent data analysis.
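The article counts can be cross-checked with a few lines of arithmetic. The Python sketch below uses the per-database numbers shown in Figure 3; reading the bracketed 88 as the raw number of hits before removing duplicates is an assumption that the text does not state explicitly.

```python
# Hits per database as shown in Figure 3: (earlier SLR, additional search).
hits = {
    "IEEE": (53, 25),
    "AISeL": (4, 7),
    "ProQuest": (49, 32),
    "ACM": (14, 13),
    "ScienceDirect": (22, 11),
}
earlier_total = sum(a for a, _ in hits.values())
additional_total = sum(b for _, b in hits.values())
duplicates = additional_total - 84   # 84 articles remained without duplicates
relevant = 29 + 21                   # articles kept after filtering each set

print(earlier_total, additional_total, duplicates, relevant)
# prints: 142 88 4 50
```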

3.1.2 Selection of Terms

As shown in Figure 2, the SLR was followed by a

selection of terms to analyse the found articles. For

the selection, we used existing system characteristics,

covering terminology from the speciﬁcations pre-

sented in Section 2. As data sovereignty relates to pri-

vacy and data protection, we added each privacy pro-

tection goal to the set of terms. Next, we considered

the FAIR principles as relevant, whereby ‘accessibil-

ity’ is subordinate to ‘availability’, as accessibility is

part of the availability of data or systems (Hansen

et al., 2015). In addition, we adopted terms from ISO/IEC 25010:2011 (International Organization for Standardization, 2011). Of the eight characteristics, we omitted ‘functional suitability’ and ‘compatibility’ as being too unspecific; ‘interoperability’ and ‘reliability’ are included in the FAIR principles. Here, reliability is a part of integrity (Hansen et al., 2015).

We added the remaining ﬁve terms (performance, us-

ability, maintainability, portability, security) to the list

of relevant terms. We considered their child terms as

synonyms (if not already contained in the list, e.g.,

‘conﬁdentiality’ or ‘integrity’), not in a grammatical

sense but in their meaning.

For the selection of relevant system characteris-


tics of our questionnaire, ﬁrst, we scanned the avail-

able articles from our SLR for the frequency of the

above-selected terms. We assumed that the relevance

of terms increases with rising frequency in a text. We

used a Python script to analyse the articles by word

stems. Second, we conducted a reverse lookup so that we would not miss relevant terms overlooked in the pre-selection. Accordingly, we searched the articles for frequent terms using NLTK and KeyBERT (Grootendorst, 2020).
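The first step, counting the selected terms by word stem, might look roughly like the following Python sketch. It is not the script used in the study, which is not shown: the stems are hypothetical, and simple prefix matching stands in for a real stemmer so that the example needs only the standard library.

```python
import re
from collections import Counter

# Hypothetical stems for some of the selected terms; the stems actually used
# in the study are not given in the text.
STEMS = {
    "security": "secur",
    "transparency": "transparen",
    "interoperability": "interoperab",
    "portability": "portab",
}


def stem_frequencies(text: str, stems: dict = STEMS) -> Counter:
    """Count tokens that start with each stem (a crude stand-in for stemming)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter({term: 0 for term in stems})
    for token in tokens:
        for term, stem in stems.items():
            if token.startswith(stem):
                counts[term] += 1
    return counts


sample = ("Secure systems need security by design. Transparent logging "
          "supports transparency, and interoperable interfaces help too.")
frequencies = stem_frequencies(sample)
print(frequencies["security"], frequencies["portability"])
```

Prefix matching overcounts when a stem is also the beginning of an unrelated word, so a proper stemmer would be the more careful choice for a full corpus.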

The overall result set included 15 system charac-

teristics: intervenability, transparency, conﬁdential-

ity, integrity, availability, (data) ﬁndability, interoper-

ability, reusability, performance (efﬁciency), usabil-

ity, security, maintainability, portability, trustworthi-

ness, and automation. The statistical evaluation of

the matches had already indicated one possible direc-

tion of our survey: Security was mentioned very often

in the analysed publications, while the searches for

reusability or portability did not result in any signiﬁ-

cant matches. A simple examination of the terms and

their meaning made us believe that the following char-

acteristics had no signiﬁcant inﬂuence on data sover-

eignty: maintainability, reusability, ﬁndability, porta-

bility, and automation. However, we included them in

the survey to minimise subjectivity in its design.

3.2 Survey Design

Online questionnaires offer a suitable way of inter-

viewing people in a targeted manner and without

much effort. There is no need for an interviewer, and

the participant can deal with the discussed subject in-

dependently of time and place. That can avoid unde-

sirable methodological effects, facilitate completion,

and thus increase data quality (Krosnick, 2018).

3.2.1 Questionnaire Design

We tested our questionnaire in a trial run to check the

formulations of the questions, how long it takes to

complete the questionnaire, and whether everything

works as expected from a technical point of view. As

a result, we had to make a few adjustments, includ-

ing the wording of the questions, which are already

incorporated below.

We divided the questionnaire into three parts and 55 questions, summarised in Table 1. At the beginning, some opening words described the further course of the survey and presented an established definition of the term ‘data sovereignty’. That should help to create the same knowledge base as a prerequisite for all participants and avoid any subsequent ambiguities. In addition, we recorded the participant’s name for possible follow-up contacts during the evaluation phase (cf. Q01).

NLTK: https://www.nltk.org (accessed on 2023-10-12)

The ﬁrst part of the questionnaire consisted of

closed questions that allowed “respondents to select

an answer from a set of choices” (Krosnick, 2018,

p.266). It asked the participants to individually as-

sess the presented system characteristics and their im-

pact and relation to the implementation of data sov-

ereignty (cf. Q02-Q46). Here again, we provided

some deﬁnitions. To answer the guiding questions,

we used pairs of 7-point Likert scales with the options ‘strongly negative’, ‘negative’, ‘slightly negative’, ‘neutral’, ‘slightly positive’, ‘positive’, and ‘strongly positive’. The first scale evaluated the existing or strongly positive characteristic (as appropriate);

the second assessed the opposite (missing or negative

characteristic). We offered an optional comment ﬁeld

to justify the selected answer.
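For a statistical evaluation, the paired Likert answers can be mapped to numbers. The Python sketch below assumes a symmetric coding from -3 to 3 and uses the median as summary; both choices, as well as the sample answers, are illustrative assumptions and not taken from the study.

```python
from statistics import median

# Symmetric coding of the seven options; the numeric values are an assumption.
SCORES = {
    "strongly negative": -3,
    "negative": -2,
    "slightly negative": -1,
    "neutral": 0,
    "slightly positive": 1,
    "positive": 2,
    "strongly positive": 3,
}


def summarise(answers: list) -> float:
    """Median score of one Likert item across participants."""
    return median(SCORES[a] for a in answers)


# Made-up answers for one characteristic (not study data).
presence = ["strongly positive", "positive", "positive", "neutral"]
absence = ["negative", "strongly negative", "slightly negative", "negative"]
print(summarise(presence), summarise(absence))
```

The median is used here because Likert answers are ordinal; a mean would additionally assume equal distances between the options.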

The second part of the questionnaire comprised

open questions to identify other characteristics of a

system that we might have missed in the study design

(cf. Q47-Q49). In addition, a further comment ﬁeld

offered the opportunity to express additional thoughts

and opinions relevant to the questionnaire evaluation.

The third and ﬁnal part of the questionnaire asked for

general information about the participants for statis-

tical analyses, including the current job title, the cur-

rent employer, and the extent of experience with the

concept of data sovereignty (cf. Q50-Q55).

3.2.2 Participant Selection

The participants were a heterogeneous group of peo-

ple with different job positions and companies of var-

ious sizes based in Western Europe, primarily in Ger-

many. The main selection criterion was the knowl-

edge of the topic of data sovereignty and, thus, the

suitability to provide well-founded information. We

present an analysis of their ages, job positions, and

professional experiences in Section 4.1.

3.3 Data Collection

We invited the participants by personal e-mail. 18 out of 25 contacted persons agreed to take part in our study. We provided the questionnaire with the help of Microsoft Forms. This tool offers the possibility of simple participation without registration and easy administration and analysis of responses.

Microsoft Forms: https://forms.office.com (accessed on 2024-02-26)


Table 1: Shortened List of Questions.

ID | Question (Q) | Input Type
Q01 | What is your full name? | Text field
Q02 | How does the presence of X affect data sovereignty? | Likert scale
Q03 | How does the absence of X affect data sovereignty? | Likert scale
Q04 | Justify your answer. (optional) | Text field
... | Repetition of Q02-Q04 with each X ∈ {“intervenability”, “transparency”, “confidentiality”, “integrity”, “availability”, “data findability”, “interoperability”, “reusability”, “performance efficiency”, “usability”, “security”, “maintainability”, “portability”, “trustworthiness”, “automation”} | ...
Q47 | From your point of view, what other system characteristics have a positive impact on the implementation of data sovereignty? Why? | Text field
Q48 | From your point of view, what other system characteristics have a negative impact on the implementation of data sovereignty? Why? | Text field
Q49 | Is there anything else you would like to share that could be relevant for the evaluation? | Text field
Q50 | How old are you? | Choice box
Q51 | Who is your current employer? | Text field
Q52 | What is your current job title? | Text field
Q53 | What is your professional background (education/studies/profession)? | Text field
Q54 | How many years of work experience do you have? | Text field
Q55 | Since when are you familiar with the concept of data sovereignty? | Text field

3.4 Data Analysis

We analysed the collected data in two ways: we sta-

tistically evaluated the inputs via Likert scales and the

answers to questions Q51 to Q55, and qualitatively

analysed all inputs via text ﬁelds.

For the interpretation of the results, we use goal

models. As described in Section 2.4, goal modelling

focuses on actors and expresses their intentions and

strategies. When analysing the requirements for a system to implement data sovereignty, we consider the

desires of a data sharing participant as an actor who

uses the system. The created goal models help to de-

rive appropriate software requirements.

4 RESULTS

The following subsections present the study results

including the descriptive ﬁndings and our interpreta-

tion and analysis of the participants’ responses.

4.1 Descriptive Findings

In total, we surveyed 18 persons, aged between 25

and 44. They represented seven industrial compa-

nies and two research organisations. The distribution

shows that over 50 percent of the respondents have

a research and development (R&D) background. As

Table 2 shows, the participants have different profes-

sions but are all technically orientated, from software

development to research to mid-level management.

Around 75 percent of the respondents have completed a higher academic degree, a significant share of them with a specialisation in computer science or related subjects. More than half have ten or more years of professional experience, and all have been familiar with data sovereignty for at least one year, most even for four years or more.

When presenting and discussing our results, we refer

to the participants and their citations using numbers

to ensure their anonymity.

Table 2: Overview of Survey Participants and Job Positions.

Job Position Participants (P)

Development P01, P05, P18

IT Management P04, P08-10, P15, P17

R&D P02-03, P06-07, P11-14, P16

The participants spent an average of 70 minutes

completing the questionnaire. As expected, most se-

lected system characteristics were rated positively in

their presence and negatively in their absence. Nevertheless, there were also characteristics whose presence was rated positively but whose absence was not rated negatively and which we therefore interpret as not posing a risk when missing. Overall, we observe the following correlation: the more positive the

impact of the presence of a system characteristic, the

more negative its absence. The participants assessed

the presence of conﬁdentiality, integrity, security, and

trustworthiness as particularly relevant for the reali-

sation of data sovereignty and their absence as par-


ticularly risky. Also, the absence of interoperability,

transparency, availability, usability, and maintainability was rated negatively, although not as much. The assessment of intervenability was controversial: the ratings span both sides of the Likert scale for presence as well as for absence. Findability, performance efficiency, portability, and automation were considered ‘nice-to-have’,

i.e., positive in their presence but not critical in their

absence. The participants did not consider reusability

as relevant. The evaluation largely coincides with our

assumptions from Section 3.1.2.
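The grouping described above can be expressed as a simple decision rule over the two ratings of a characteristic. In the Python sketch below, the scale from -3 to 3, the thresholds, and the example scores are all illustrative assumptions; the study reports the grouping only qualitatively.

```python
def classify(presence: float, absence: float) -> str:
    """Label a characteristic from two median scores on a -3..3 scale.

    The thresholds are illustrative assumptions, not values from the study.
    """
    if presence >= 2 and absence <= -2:
        return "critical"
    if presence > 0 and absence < 0:
        return "relevant"
    if presence > 0:
        return "nice-to-have"
    return "not relevant"


# Made-up scores chosen to mirror the qualitative pattern described above.
examples = {
    "confidentiality": (3, -3),
    "usability": (1, -1),
    "portability": (1, 0),
    "reusability": (0, 0),
}
for name, (p, a) in examples.items():
    print(name, "->", classify(p, a))
```

A controversial characteristic such as intervenability would need an additional measure of spread, which this rule deliberately leaves out.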

All additionally collected characteristics (cf. Q47

and Q48) can be allocated to the previously selected

terms. For example, controllability, observability,

and modiﬁability belong to intervenability and trans-

parency. Most participants intensively used the com-

ment ﬁelds to justify and discuss their responses.

Thus, we derived the following requirements from

evaluating the Likert scales and from a qualitative

analysis of the comments.

4.2 Deﬁnition of Requirements

Figure 4 depicts a goal model focusing on the strate-

gic rationales for achieving data sovereignty. We see

‘data sovereignty’ and ‘trustworthiness’ as soft goals

as both are NFRs whose achievement is not clearly

deﬁned and measurable. We consider all previously

surveyed system characteristics as measurable NFRs,

i.e., as goals. The model illustrates that some goals

have a direct inﬂuence on data sovereignty (interven-

ability, security, interoperability), while others have

an indirect one by supporting other goals (integrity,

conﬁdentiality, usability, transparency, ﬁndability).

No goal has a make relation as no requirement can

achieve the soft goal independently.

Figure 4 highlights reusability, maintainability, performance, portability, availability, and automation in white, as we do not consider them further in the derivation of FRs. Reusability is a requirement that cannot be externally assessed during an initial requirements engineering process. Availability and its sub-tasks affect intervenability but have little relevance for the establishment of data sovereignty. Automation can have a direct influence on data sovereignty, either positive or negative; however, it is more an extension of other FRs and does not stand on its own.

Figure 4: Goal Model for Data Sovereignty.

The following subsections analyse each goal by defining strategies and tasks as goal models. We derive one FR for the system under consideration from each task. The set of FRs is listed in Table 3. The formulation of the FRs uses MoSCoW prioritisation (must, should, could, would). For simplicity, in the following, we refer to the data sharing participant as such and only concretise its role if necessary. For concretisation, we will only use the terms data provider and consumer. A data provider includes the data rights holder, data providing agents, and third parties like data intermediaries. A data consumer means a data consuming agent, also known as a data recipient or user.

4.2.1 Intervenability

Intervenability is the ability to observe and actively interrupt or modify data processing (Hansen et al., 2015; cf. Section 2.1). Figure 5 depicts a goal model that illustrates the relations between intervenability, its means, and data sovereignty.

Intervenability Helps or Hurts Data Sovereignty

(Cf. FR01-FR07). Intervenability is a “core con-

struct” (P08) for the self-determination of data shar-

ing participants. Both data providers and consumers

should be able to modify data ﬂows anytime. Yet,

there is some “potential for error and failure” (P09)

and a risk of misuse of provided intervenability mech-

anisms (P05; P08; P09). The data consumer must not

be able to bypass the data usage conditions deﬁned

by the data provider (P05). Therefore, mechanisms

for intervenability should not be applied by default

or without consent by involved data sharing partici-

pants (P10). If applicable, the data sharing partici-

pants should be able to negotiate the data usage con-

ditions.
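How such a safeguard could look in code is sketched below in Python. The `UsageConditions` type, its fields, and the action names are hypothetical; the sketch only illustrates the idea that an intervention is permitted when it is covered by the data usage conditions (cf. FR02) and the acting party has the participants' consent (cf. FR05).

```python
from dataclasses import dataclass, field


@dataclass
class UsageConditions:
    """Hypothetical, simplified data usage conditions set by the provider."""
    allowed_actions: set = field(default_factory=set)
    consenting_parties: set = field(default_factory=set)


def may_intervene(actor: str, action: str, conditions: UsageConditions) -> bool:
    """Allow an intervention only if it is covered by the usage conditions
    (cf. FR02) and the actor has the participants' consent (cf. FR05)."""
    return (action in conditions.allowed_actions
            and actor in conditions.consenting_parties)


conditions = UsageConditions(
    allowed_actions={"interrupt_processing", "delete_copy"},
    consenting_parties={"provider"},
)
print(may_intervene("provider", "interrupt_processing", conditions))     # True
print(may_intervene("third_party", "interrupt_processing", conditions))  # False
print(may_intervene("provider", "export_raw_data", conditions))          # False
```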


Figure 5: Goal Models for Helping Data Sovereignty with Intervenability (left) and Security (right).

Usability Is a Means of Intervenability (Cf. FR08).

The better the usability of the applied systems, e.g.,

by providing a graphical user interface with a good

user experience, the easier it is to access essential

utilities such as the deﬁnition of data usage condi-

tions (P06; P07; P13). Missing or reduced usability

of interfaces, e.g., due to complexity, also on a tech-

nical level, “might hinder market adoption” (P10) or

increase the risk of incorrect use (P18).

Transparency Is a Means of Intervenability and

Helps Trustworthiness (Cf. FR09-FR10). Trans-

parency is the primary means of proving that data us-

age conditions are respected (P07; P12; P17), thus

also strengthening the trust of data sharing partici-

pants (P01; P05). That may be a must from a legal

perspective (P13); however, it is up to the data sharing

participants and their requirements to decide where

transparency is required (P09). During implementa-

tion, it is crucial to restrict access to logging or audit

trails, as transparency can otherwise quickly result in

abuse of conﬁdential information (P10; P18).
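The tension between keeping track of data processing (cf. FR09) and restricting access to the resulting logs can be illustrated with a minimal audit trail. The following Python sketch is an assumption-laden example: the class, its method names, and the reader list are hypothetical, and an in-memory list stands in for tamper-resistant storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditTrail:
    """Append-only log of data processing events with restricted read access."""
    readers: set                       # actors allowed to read the trail
    _events: list = field(default_factory=list)

    def record(self, actor: str, action: str, data_id: str) -> None:
        stamp = datetime.now(timezone.utc).isoformat()
        self._events.append((stamp, actor, action, data_id))

    def read(self, requester: str) -> list:
        if requester not in self.readers:
            raise PermissionError(f"{requester} may not read the audit trail")
        return list(self._events)      # copy, so callers cannot alter the log


trail = AuditTrail(readers={"provider"})
trail.record("consumer", "read", "dataset-1")
print(len(trail.read("provider")))
```

Who belongs in `readers` is exactly the kind of decision that, according to the participants, should be left to the data sharing participants and their requirements.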

Findability Is a Means of Transparency (Cf.

FR11-FR13). Data sovereignty and transparency

require data to be traceable (P12) and thus ﬁndable.

Most importantly, locating data helps to establish

references to existing data usage conditions (P05;

P06). In this context, it is essential to distinguish be-

tween internal and external ﬁndability. Internal ﬁnd-

ability is achieved, e.g., by unique and persisted iden-

tiﬁers, whereas external ﬁndability, in the sense of

discoverability of data offerings, must remain under

the control of the data provider (P09; P10).
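The distinction between internal and external findability can be made concrete with a small catalogue. In the Python sketch below, all names are hypothetical, the identifier is a random UUID, and a dictionary stands in for persistent storage; it shows an identifier assigned once per data set (cf. FR12) and discoverability that stays off unless the data provider opts in (cf. FR13).

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class DataSet:
    title: str
    discoverable: bool = False   # external findability, off unless opted in
    # Internal findability: identifier assigned once at creation (cf. FR12).
    identifier: str = field(default_factory=lambda: str(uuid.uuid4()))


class Catalogue:
    def __init__(self) -> None:
        self._by_id = {}

    def register(self, data_set: DataSet) -> str:
        self._by_id[data_set.identifier] = data_set
        return data_set.identifier

    def lookup(self, identifier: str) -> DataSet:
        """Internal lookup by identifier."""
        return self._by_id[identifier]

    def public_offers(self) -> list:
        """External view: only data sets the provider made discoverable."""
        return [d.title for d in self._by_id.values() if d.discoverable]


catalogue = Catalogue()
internal = DataSet("machine telemetry")
offered = DataSet("aggregated statistics", discoverable=True)
internal_id = catalogue.register(internal)
catalogue.register(offered)
print(catalogue.lookup(internal_id).title)
print(catalogue.public_offers())
```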

4.2.2 Security

Security is the “degree to which a product or sys-

tem protects information and data so that persons or

other products or systems have the degree of data ac-

cess appropriate to their types and levels of authoriza-

tion” (International Organization for Standardization,

2011). Figure 5 depicts a goal model that illustrates

the relations between security, its means, and data

sovereignty.

Security Helps Data Sovereignty and Trustworthi-

ness (Cf. FR14-FR15). Security helps the trustwor-

thiness of a system and, with this, affects the data

sovereignty of a data sharing participant (P05; P09).

However, only selected requirements can have a di-

rect impact. For instance, a system should be able

to verify the identity of a data sharing participant or

guarantee the indisputability of actions. Apart from

conﬁdentiality and integrity, all other security require-

ments depend on the type of data, the respective use

case, and the data usage conditions (P02; P17; P18).

Accountability, e.g., is often a legal aspect (P18).

Conﬁdentiality Is a Means of Security and Helps

Trustworthiness (Cf. FR16-FR17). Conﬁdential-

ICSOFT 2024 - 19th International Conference on Software Technologies

118

Table 3: List of FRs.

ID Functional Requirements (The system . . . )

FR01 . . . SHOULD enable data sharing participants to modify data (ﬂows).

FR02 . . . MUST ensure that data (ﬂow) modiﬁcations are consistent with the data usage conditions.

FR03 . . . SHOULD enable a data provider to interrupt data processing activities on the data consumer side.

FR04 . . . COULD enable a data provider to execute operations on shared data on the data consumer side.

FR05 . . . MUST prevent interventions by third parties without the consent of the data sharing participants.

FR06 . . . SHOULD provide notiﬁcations if data usage does not comply with the data usage conditions.

FR07 . . . COULD enable data sharing participants to negotiate data usage conditions.

FR08 . . . COULD provide a graphical user interface to lower the barriers for non-experts.

FR09 . . . SHOULD provide features to keep track of data processing activities and data lineage at any time.

FR10 . . . MUST ensure that data usage conditions are accessible to all data sharing participants.

FR11 . . . SHOULD provide features to enrich any data with metadata, at least the data usage conditions.

FR12 . . . MUST assign a system-wide persisted unique identiﬁer to each data set.

FR13 . . . COULD provide features that enhance the discoverability of data.

FR14 . . . MUST provide mechanisms to ensure the indisputability of occurring events and actions.

FR15 . . . MUST incorporate mechanisms to authenticate and verify the identity of data sharing participants.

FR16 . . . MUST ensure access to data and metadata only by authorised actors, i.e., systems and users.

FR17 . . . MUST enforce the conﬁdential handling of data following the agreed upon data usage conditions.

FR18 . . . MUST prohibit changes to data by a data sharing participant without the data provider’s consent.

FR19 . . . MUST not remove any reference to the data origin without explicit consent of the data provider.

FR20 . . . COULD implement common data formats to facilitate data transfers.

FR21 . . . MUST implement common vocabularies for data usage conditions.

FR22 . . . SHOULD implement common protocols for data sharing.

ity is crucial for building trust (P01) and positively

impacts data sovereignty, as it prevents the general

misuse of data by third parties. Accordingly, data

sharing participants are always well-advised to act ac-

cording to the “Need-to-Know” (P08) principle. Nev-

ertheless, conﬁdentiality only needs to be ensured to

the extent required by the data sovereign (P04; P12;

P14). For example, when it comes to open data, con-

ﬁdential handling would not be part of the data usage

conditions and thus irrelevant.
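The dependency of confidentiality on the agreed data usage conditions can be illustrated with a minimal sketch. The class `UsageConditions`, its fields, and the function `may_access` are hypothetical names introduced for illustration only; they are not part of the study or of any existing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageConditions:
    """Hypothetical container for agreed data usage conditions."""
    confidential: bool                      # False for, e.g., open data
    authorised_actors: frozenset = frozenset()


def may_access(actor: str, conditions: UsageConditions) -> bool:
    """Grant access if no confidentiality is required or the actor is authorised."""
    if not conditions.confidential:
        # Confidential handling is not part of the usage conditions.
        return True
    return actor in conditions.authorised_actors


# Assumed example data: one open data set and one restricted data set.
open_data = UsageConditions(confidential=False)
restricted = UsageConditions(
    confidential=True,
    authorised_actors=frozenset({"consumer-a"}),
)
```

In this sketch, confidentiality is enforced only when the data sovereign has made it part of the usage conditions, which mirrors the open data example above.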

Integrity Is a Means of Security and Helps Trust-

worthiness (Cf. FR18-FR19). Integrity is critical

when implementing security and trust (P05), as “no

secure and/or trustworthy environment can be created

on a compromised system” (P06). In this context, it is particularly important that neither data modifications nor the removal of the data origin are executed without prior consent.
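A possible reading of FR18 and FR19 is a consent check that guards both operations. The following sketch assumes a hypothetical `SharedDataSet` class in which the data provider's consents are recorded as plain action names; a real system would need a verifiable consent record.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


class ConsentMissing(Exception):
    """Raised when an action lacks the data provider's prior consent."""


@dataclass
class SharedDataSet:
    payload: str
    origin: Optional[str]  # reference to the data origin
    provider_consents: Set[str] = field(default_factory=set)

    def _require(self, action: str) -> None:
        # Hypothetical check: consent is modelled as a set of action names.
        if action not in self.provider_consents:
            raise ConsentMissing(f"no prior consent for '{action}'")

    def modify(self, new_payload: str) -> None:
        self._require("modify")  # cf. FR18
        self.payload = new_payload

    def remove_origin(self) -> None:
        self._require("remove_origin")  # cf. FR19
        self.origin = None
```

Both operations fail with `ConsentMissing` and leave the data set unchanged until the provider has consented to the respective action.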

4.2.3 Interoperability

Interoperability is the “degree to which two or more

systems, products or components can exchange in-

formation and use the information that has been ex-

changed” (International Organization for Standard-

ization, 2011). Figure 6 depicts a goal model that

illustrates the relations between interoperability, its

means, and data sovereignty.

Figure 6: Goal Model for Helping Data Sovereignty with

Interoperability.

Interoperability Helps Data Sovereignty (Cf.

FR20-FR22). Interoperability “enables the capabil-

ity to conduct [. . . ] data sovereignty” (P04). It is a

fundamental condition for data sharing. Concerning

data sovereignty, interoperability and thus the use of

mutual communication protocols, including deﬁned

processes and a common vocabulary, ensures a stan-

dardised understanding among all data sharing partic-

ipants (P02; P15). That allows for an equal harmoni-

sation, interpretation, and implementation of data us-


age conditions (P05; P06). At the same time, inter-

operability can increase the system’s scalability, efﬁ-

ciency, and automation (P15).
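To illustrate how a common vocabulary (cf. FR21) supports an equal interpretation of data usage conditions, consider the following sketch. The terms in `UsageTerm` are invented for illustration and do not correspond to any standardised vocabulary.

```python
from enum import Enum


class UsageTerm(Enum):
    """Hypothetical shared vocabulary for data usage conditions."""
    USE = "use"
    DISTRIBUTE = "distribute"
    DELETE_AFTER = "delete-after"


def parse_conditions(raw_terms):
    """Map raw terms onto the shared vocabulary and reject unknown ones."""
    known = {term.value: term for term in UsageTerm}
    unknown = [t for t in raw_terms if t not in known]
    if unknown:
        # A term outside the vocabulary cannot be interpreted consistently.
        raise ValueError(f"terms outside the common vocabulary: {unknown}")
    return [known[t] for t in raw_terms]
```

Rejecting unknown terms, instead of silently ignoring them, is one conceivable design choice: it prevents two participants from interpreting the same conditions differently.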

5 RELATED WORK

Recent research papers have already dealt with the ex-

ploration of requirements for data sovereignty in in-

dustrial data sharing, also using empirical methods.

For example, Biehs and Stilling (2024) and Hellmeier

et al. (2023) conducted interview studies to identify

requirements for data sharing and, in this context, pri-

marily considered the implementation of data sover-

eignty. Nevertheless, most works have one thing in

common: they focus on speciﬁc aspects determined

by the research domain and inﬂuenced by certain use

cases. For instance, Opriel et al. (2021) concentrate

on the exchange of sensitive data. The work of Lar-

rinaga (2022) considers data sovereignty from a man-

ufacturing perspective and equates it with usage con-

trol. We addressed this by asking people from various

industrial domains, from energy, mobility and logis-

tics, manufacturing and automotive to healthcare. By

focusing on common system characteristics and de-

riving system features, not the content of data usage

conditions, we demonstrate that establishing data sov-

ereignty is not only about implementing access and

usage control.

Furthermore, many works mix different perspectives (economic, regulatory, legal, and technical) or provide requirements at varying levels of detail without conclusively specifying them, such as the combination of access control, GDPR (European Parliament and Council of the European Union, 2016), data quality, and monetisation (Biehs and Stilling, 2024). Alternatively, they elaborate on requirements that have nothing to do with data sovereignty but instead concern data sharing, such as “short loading times” (Zrenner et al., 2019, p.484) or data portability (Falcão et al., 2023).

As one of the ﬁrst, Hellmeier et al. (2023) follow a

holistic approach and separate the dimensions of data

sovereignty. However, the work also notes that the

elicited requirements from their interviewees are very

vague. Most of their answers reﬂect that people see

data sovereignty as equivalent to usage control or for-

mulate more general requirements for data sharing.

Our study also conﬁrmed this effect. It indicates a

lack of understanding or detailed analysis of what is

at the core of data sovereignty. Hence, domain experts

require tools to help them better understand their re-

quirements.

Ultimately, the existing studies primarily focus on the requirements from the user’s perspective. Our study extends these works by deriving implications for the system and the corresponding system requirements. Most importantly, we built on well-established concepts like privacy and security when considering the technical aspects of data sovereignty. In addition, we use previous approaches from related fields to represent the requirements as part of goal models. For example, Elahi and Yu (2007) have created a goal-oriented approach for analysing security trade-offs; Peixoto and Silva (2018) have used the i* goal modelling notation to model privacy requirements; and Borchert and Heisel (2021) have elaborated on how to resolve trust conflicts using goal models.

Summarising, the presented ﬁndings complement

previous work with a closer examination of the tech-

nical aspects of data sovereignty and form a solid ba-

sis for structured requirements engineering for self-

determined and autonomous data sharing.

6 DISCUSSION

Initially, we raised the question of which system characteristics have a particular influence on the implementation of the data sovereignty of a data sharing participant. The presented study demonstrates that any clear answer to this question is highly controversial, and for a reason. As outlined in Section 5, while some work states that security guarantees data sovereignty, others argue that data sovereignty is addressed by implementing access and usage control. Our assumption that the existing literature is missing a detailed consideration of data sovereignty from a technical perspective (cf. Section 1) is also reflected in the responses of some of the study participants. While all system properties were seen as essential for data sovereignty, a closer examination reveals that the non-fulfilment of most of them was not considered particularly negative. That leads to the question of how significant such characteristics can be for establishing data sovereignty.

Also, the interviewees often assumed that the data

provider was the ‘data sovereign’. However, in the

concept of self-determination and autonomy in data

sharing, the consumer has the same rights as the data

provider. We have considered this in our interpreta-

tion of the survey results and generalised the derived

requirements in a way that respects the data provider

and the data consumer.

As the core result of our study, it is essential to

emphasise that many properties and functionalities of

a system support data sovereignty but only lead to a

successful implementation in their entirety. As shown

in Figure 4, there is no single make relation targeting

ICSOFT 2024 - 19th International Conference on Software Technologies


data sovereignty. Consequently, no requirement can

ensure the fulﬁlment of the data sovereignty require-

ment in isolation. Instead, it is the combination of

requirements that provides the technical foundation.

The proportions of the three emphasised characteris-

tics (intervenability, security, and interoperability) in

implementing a system for sovereign data sharing are

determined by the speciﬁc use case, e.g., the type of

data or its processing (P11). We deﬁne half of the 22

derived requirements as a must. Whether the other

requirements, or even more, should be fulﬁlled needs

to be deﬁned in the context, e.g., by an authority in a

data ecosystem or the data sharing participants them-

selves. In addition, trust plays a central role: many

deﬁned NFRs strengthen the trustworthiness of a sys-

tem and thus the data sovereignty of the data sharing

participants.
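The observation that no single make relation targets data sovereignty can be expressed as a small check over contribution links. The sketch below uses the four iStar 2.0 contribution types (make, help, hurt, break); the listed links are an illustrative simplification based on the headings of Section 4.2 and do not reproduce the goal models of the figures.

```python
from enum import Enum


class Contribution(Enum):
    MAKE = "make"
    HELP = "help"
    HURT = "hurt"
    BREAK = "break"


# (source, contribution, target); assumed, simplified links for illustration.
LINKS = [
    ("intervenability", Contribution.HELP, "data sovereignty"),
    ("security", Contribution.HELP, "data sovereignty"),
    ("interoperability", Contribution.HELP, "data sovereignty"),
    ("integrity", Contribution.HELP, "trustworthiness"),
]


def contributions_to(target, links):
    """Return all (source, contribution) pairs that point at the target."""
    return [(s, c) for s, c, t in links if t == target]


def sufficient_in_isolation(target, links):
    """Return the sources whose make link alone would satisfy the target."""
    return [
        s for s, c in contributions_to(target, links)
        if c is Contribution.MAKE
    ]
```

For the assumed links, `sufficient_in_isolation("data sovereignty", LINKS)` yields an empty list, which corresponds to the statement that only the combination of requirements provides the technical foundation.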

We used established system characteristics, in-

cluding ISO standards, to support our ﬁndings. With

this, we conﬁrmed that FRs should be embedded in

existing concepts closely related to security, privacy,

and trust. Besides the already existing NFRs, we

worked out system requirements that focus speciﬁ-

cally on data sovereignty and, with this, also data

sharing, such as FR02, FR07, and more. A particu-

lar contribution of our work is the creation of inter-

connections focussed on the needs of participants in

industrial data sharing.

6.1 Further Considerations

With the help of questions Q47 to Q49, we have already involved the study participants in discussing the requirements for implementing data sovereignty. As expected, and as supported by other studies (Hellmeier et al., 2023; Biehs and Stilling, 2024), not only technical aspects are relevant but also, in particular, regulatory and legal aspects (P15), which immediately increase the complexity of requirements elicitation (P16). In addition, activities such as standardisation (P18), the use of open-source software (P11), and the establishment of certification processes (P11; P12) can reduce the hurdles to interoperability and trustworthiness and, therefore, to the establishment of data sovereignty. After all, not only the system but also the actors involved in data sharing must be trustworthy (P15).

From a technical perspective, it was suggested that

the system architecture, especially decentralised sys-

tems, could support the implementation of data sov-

ereignty (P09). It was argued that the involvement

of a centralised service necessarily leads to a loss of

sovereignty (P09). As a result, data sovereignty often

would not be implemented by a single system but by

many systems (P08) that require an equal fulﬁlment

of deﬁned requirements, which “may pose additional

challenges” (P05).
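The challenge of an equal fulfilment of requirements across many systems can be made concrete with a simple conformance check. The baseline in `REQUIRED` and the system names used in the example are hypothetical; which requirements form the baseline would have to be agreed upon in the respective data ecosystem.

```python
# Hypothetical baseline of requirements that every system must fulfil.
REQUIRED = {"FR15", "FR16", "FR21"}


def missing_requirements(systems, required):
    """Map each non-conforming system to the requirements it lacks."""
    return {
        name: sorted(required - fulfilled)
        for name, fulfilled in systems.items()
        if required - fulfilled
    }


# Assumed example: two systems taking part in the same data sharing.
systems = {
    "connector-a": {"FR15", "FR16", "FR21"},
    "connector-b": {"FR15", "FR21"},
}
gaps = missing_requirements(systems, REQUIRED)
```

Here, `gaps` lists only `connector-b`, which lacks FR16. In a decentralised setting, a single non-conforming system of this kind would already weaken the sovereignty of all participants that share data with it.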

Ultimately, in terms of implementation, there is always a cost-benefit trade-off (P09). In addition, the market situation (P14), which may, e.g., force a data provider to share their data less restrictively, can considerably influence the actual autonomy of the data provider, as can recent threats such as the loss or theft of digital identities (P10).

6.2 Limitations

The limitations of our work mainly concern the selection of study participants. First, a qualitative survey is limited to a small number of participants. Next, half of our participants are employed in research. That is because the topic of data sovereignty is currently in the process of being transferred from research to industry. In addition, there is a possible professional bias due to the topic of dataspaces or data ecosystems. For instance, experts in data privacy will always relate data sovereignty to the guiding principles they are familiar with, while experts in system security will primarily emphasise aspects such as confidentiality, integrity, and availability. Last, all 18 participants are located in Western Europe and are therefore biased by the research and industry of that region.

Moreover, our pre-selection of system characteristics might have limited the results. Although the questions asking for further characteristics did not yield additional criteria, this could change with a larger number of interviewees. Additionally, the test interview has already shown that the order of the questions may affect the answers. Asking first about the presence and then about the absence of a characteristic may naturally lead respondents to give the opposite assessment.

6.3 Future Work

As a follow-up of the present study, the presented

models and requirements (cf. Section 4) should be

evaluated by the study participants as well as a larger

group of people. In addition, our presented work al-

lows for the further development of a structured re-

quirements engineering process, including the anal-

ysis of the functional requirements from Table 3. In

the course of this, it will be desirable to develop an ap-

proach for conﬂict resolution. Next, with the help of

goal modelling and other appropriate methods, stake-

holders can be guided through a requirements elicita-

tion process. Furthermore, the deﬁned requirements

can be used to simplify the derivation of a system de-

sign.


7 CONCLUSION

In this work, we have focussed on the technical as-

pects of data sovereignty and the requirements for

its implementation by a system. We evaluated the

relevance of selected system characteristics with the

help of an empirical study and structured the FRs

and NFRs derived from this using goal models. Af-

terwards, we discussed our ﬁndings and compared

them to related work. Overall, we have empha-

sised that data sovereignty is not achieved by imple-

menting a deﬁnite list of system features but through

a combination of use-case-speciﬁc functional and

non-functional requirements. As one participant in

the study summarised, “[m]odern systems will have

[d]ata [s]overeignty by design” (P17). While build-

ing on privacy and security, our work has taken a

step towards a targeted requirements analysis and rea-

soned system design by extending research on self-

determination and autonomy in industrial data sharing

with a more technically reﬁned view.

ACKNOWLEDGEMENTS

This work was partially supported by the German

Federal Ministry for Economic Affairs and Climate

Action (funding number: 13IK004N). We thank

Daniel Tebernum for his valuable input and all par-

ticipants for contributing to our study.

REFERENCES

Biehs, S. and Stilling, J. (2024). Identiﬁcation of Key Re-

quirements for the Application of Data Sovereignty in

the Context of Data Exchange. In Proceedings of the

57th Annual Hawaii International Conference on Sys-

tem Sciences. ScholarSpace.

Borchert, A. and Heisel, M. (2021). Conﬂict Identi-

ﬁcation and Resolution for Trust-Related Require-

ments Elicitation A Goal Modeling Approach. J.

Wirel. Mob. Networks Ubiquitous Comput. Depend-

able Appl., 12(1):111–131.

Brauner, P., Dalibor, M., Jarke, M., Kunze, I., Koren, I.,

Lakemeyer, G., Liebenberg, M., Michael, J., Pen-

nekamp, J., Quix, C., Rumpe, B., van der Aalst, W.,

Wehrle, K., Wortmann, A., and Zieﬂe, M. (2022). A

Computer Science Perspective on Digital Transforma-

tion in Production. ACM Trans. Internet Things, 3(2).

Dalpiaz, F., Franch, X., and Horkoff, J. (2016). iStar 2.0

Language Guide. CoRR.

Elahi, G. and Yu, E. (2007). A Goal Oriented Approach

for Modeling and Analyzing Security Trade-Offs. In

Proceedings of the 26th International Conference on

Conceptual Modeling, pages 375–390. Springer.

European Parliament and Council of the European Union

(2016). Regulation (EU) 2016/679 of the European

Parliament and of the Council.

Falcão, R., Matar, R., Rauch, B., Elberzhager, F., and Koch,

M. (2023). A Reference Architecture for Enabling

Interoperability and Data Sovereignty in the Agricul-

tural Data Space. Information, 14(3):197.

Garousi, V., Felderer, M., and Mäntylä, M. V. (2019).

Guidelines for including grey literature and conduct-

ing multivocal literature reviews in software engineer-

ing. Information and Software Technology, 106:101–

121.

Grootendorst, M. (2020). KeyBERT: Minimal keyword ex-

traction with BERT.

Hansen, M., Jensen, M., and Rost, M. (2015). Protection

Goals for Privacy Engineering. In 2015 IEEE Security

and Privacy Workshops. IEEE Computer Society.

Hellmeier, M., Pampus, J., Qarawlus, H., and Howar, F.

(2023). Implementing Data Sovereignty: Require-

ments & Challenges from Practice. In Proceedings

of the 18th International Conference on Availability,

Reliability and Security. ACM.

Hellmeier, M. and von Scherenberg, F. (2023). A Delimita-

tion of Data Sovereignty from Digital and Technolog-

ical Sovereignty. In Proceedings of the 31st European

Conference on Information Systems.

International Organization for Standardization (2011).

ISO/IEC 25010:2011, Systems and software engineer-

ing, Systems and software Quality Requirements and

Evaluation (SQuaRE), System and software quality

models. Standard.

Jarke, M., Otto, B., and Ram, S. (2019). Data Sovereignty

and Data Space Ecosystems. Business & Information

Systems Engineering, 61(5):549–550.

Krosnick, J. A. (2018). Questionnaire Design. In The Pal-

grave Handbook of Survey Research. Springer Inter-

national Publishing.

Larrinaga, F. et al. (2022). Data Sovereignty - Requirements

Analysis of Manufacturing Use Cases.

Opriel, S., Möller, F., Burkhardt, U., and Otto, B. (2021).

Requirements for Usage Control based Exchange of

Sensitive Data in Automotive Supply Chains. In Pro-

ceedings of the 54th Hawaii International Conference

on System Sciences.

Peixoto, M. M. and Silva, C. (2018). Specifying privacy

requirements with goal-oriented modeling languages.

In Proceedings of the XXXII Brazilian Symposium on

Software Engineering. ACM.

von Scherenberg, F., Hellmeier, M., and Otto, B. (2024).

Data Sovereignty in Information Systems. Electronic

Markets, 34(1):1–11.

Wilkinson, M. D. et al. (2016). The FAIR Guiding Princi-

ples for scientiﬁc data management and stewardship.

Scientiﬁc Data, 3(1):1–9.

Zrenner, J., Möller, F. O., Jung, C., Eitel, A., and Otto, B.

(2019). Usage control architecture options for data

sovereignty in business ecosystems. Journal of Enter-

prise Information Management, 32(3):477–495.
