Rigor in Applied Data Science Research Based on DSR: A Literature Review

Daniel Szafarski (https://orcid.org/0009-0008-8992-5152), Tobias Schmieg (https://orcid.org/0009-0006-9735-0788), Laslo Welz (https://orcid.org/0009-0006-6061-1921) and Thomas Schäffer (https://orcid.org/0000-0001-8097-286X)
Institute of Information Systems, University of Applied Sciences Heilbronn, Max-Planck-Str. 39, 74081 Heilbronn, Germany

Keywords: Design Science Research, Data Science, Rigor, Concept Matrix, Literature Review.
Abstract: Design Science Research (DSR) enjoys increasing popularity in the field of information systems due to its practical relevance and focus on design. A study from 2012 shows that DSR publications in general exhibit weak rigor in connection with the selection and use of research methods. At the same time, there has also been a recent increase in Data Science publications based on the DSR paradigm. Therefore, this study analyzes the rigor and the specific characteristics of the application of DSR based on 62 publications from this field. Major deficits are observed in a large part of the sample regarding the rigorous documentation of the scientific process as well as the selection and citation of adequate research methods. Overall, 77.4 % of the analyzed publications were therefore characterized as weak with regard to their rigor. One explanation is the novel combination of DSR and Data Science, together with the speed at which new findings are obtained and published.
1 INTRODUCTION
As a result of the digital transformation, Data Science has developed into a trending topic that has recently gained importance and is discussed in both theory and practice (Jordan and Mitchell, 2015). Due to the increasing volume of available data and the easy applicability of machine learning algorithms in various areas, this technology is key to the digital transformation of a company and leads to major productivity increases through automation (Goes, 2014; Abbasi et al., 2016). Moreover, researchers emphasize the focus on IT artifacts and their analysis, which is particularly relevant for business and society (Benbasat and Zmud, 2003; Saar-Tsechansky, 2015; Abdel-Karim et al., 2021).
Considering the research focus of information sys-
tems in the field of Data Science Research, it is not
surprising that numerous publications in this context
select Design Science Research (DSR) as their re-
search paradigm. DSR has become an established re-
search paradigm in the field of information systems in
general and gained in popularity (Alturki et al., 2012).
A major cornerstone for this development was laid by Hevner et al. (2004), who position DSR as an alternative to traditional behavioral science research (Samuel-Ojo et al., 2010; Gregor and Hevner, 2013; Pascal and Renaud, 2020). The numerous parallels to Data Science research lead Saar-Tsechansky (2015) to conclude that the guidelines defined by Hevner et al. (2004) for conducting DSR also apply to publications in the field of Data Science Research. However, due to the novelty of DSR in the field of Data Science Research and the interdisciplinary nature of this research area, there are potential ambiguities in adapting the guidelines defined by Hevner et al. (2004). For instance, the guidelines refer to practical relevance, yet Elragal and Klischewski (2017) argue that Data Science publications often lack such practical relevance: instead of relevant problems, use cases with a large number of existing datasets are usually preferred. Researchers also question the rigor of such Big Data datasets, since it is often unclear how rigorously the data were collected and which hypotheses lay behind their collection (Elragal and Klischewski, 2017). In addition,
Big Data creates new needs in terms of evaluating ar-
tifacts and demonstrating knowledge gain and prob-
lem solving (Elragal and Haddara, 2019).
Consequently, it is important to pay attention
to both the practical applicability and the scientific
grounding of Data Science publications. Even before the ambiguities presented in connection with Data Science, Arnott and Pervan (2012) identified in a study that two-thirds of DSR publications show weak rigor in connection with the selection and use of research methods. It is currently un-
known whether, and if so, how, Data Science publi-
cations rigorously apply DSR to meet the demands of
a scientific community and the guidelines of Hevner
et al. (2004). Therefore, the following research ques-
tion (RQ) arises:
RQ: How is the rigor of DSR addressed in Data
Science research?
The aim of this literature review is to analyze Data Science publications from the past three years that claim to follow the DSR approach and to examine their rigor.
To achieve the objective, the following chapter
lays the necessary theoretical foundations. Building
upon this, the methodology as well as the selection
criteria for the research data will be explained in more
detail. Subsequently, the data will be analyzed on the
basis of the previously defined schema in order to de-
rive statements about the rigor. The findings of this research are summarized in a corresponding overview. Finally, after a short summary of the results, the limitations and implications of this paper are pointed out.
2 THEORETICAL BACKGROUND
In this chapter, the terms “Data Science Research”, “DSR”, and “Rigor” are explicitly presented and explained to provide an overview of the theoretical foundation of this paper.
2.1 Data Science Research
Data Science is an established interdisciplinary field that combines scientific methods, systems, and processes from statistics, information science, and computer science to gain insights into phenomena from structured or unstructured data (Zhu and Xiong,
2015). According to van der Aalst (2016), Data Sci-
ence is used to assign business value to data. Ethical,
social, and legal aspects also play a role when consid-
ering the resulting insights (van der Aalst, 2016).
Nowadays, several process models can help re-
searchers to execute data science projects (Baijens
et al., 2020). Well-known reference models are the
CRoss Industry Standard Process for Data Mining
(CRISP-DM) as well as the Knowledge Discovery
in Databases (KDD). Based on them, several addi-
tional process models emerged over time (Martínez-Plumed et al., 2021). Taking CRISP-DM as an example, the typical process activities are arranged sequentially in six different phases that can
optionally be iterated several times. These are (1)
Business Understanding, (2) Data Understanding, (3)
Data Preparation, (4) Modeling, (5) Evaluation, and
(6) Deployment (Chapman et al., 2000).
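To make the sequential yet iterative character of these phases concrete, the following minimal sketch enumerates them in Python. The phase names are taken from Chapman et al. (2000); the iteration logic, function, and dictionary fields are illustrative assumptions and not part of the reference model itself.

    # Minimal sketch of the CRISP-DM phase sequence (phase names per Chapman et al., 2000).
    # The iteration logic and the phase callback are illustrative assumptions only.

    CRISP_DM_PHASES = [
        "Business Understanding",
        "Data Understanding",
        "Data Preparation",
        "Modeling",
        "Evaluation",
        "Deployment",
    ]

    def run_crisp_dm(run_phase, max_iterations: int = 3) -> dict:
        """Run the six phases in order; optionally repeat the whole cycle
        until the Evaluation phase reports a satisfactory result."""
        results = {}
        for iteration in range(1, max_iterations + 1):
            for phase in CRISP_DM_PHASES:
                results[phase] = run_phase(phase, iteration)
            if results["Evaluation"].get("satisfactory", True):
                break  # stop iterating once the evaluation is satisfactory
        return results

    # Example usage with a dummy phase handler:
    if __name__ == "__main__":
        dummy = lambda phase, i: {"satisfactory": True, "note": f"{phase} done in iteration {i}"}
        print(run_crisp_dm(dummy)["Deployment"]["note"])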
2.2 Design Science Research (DSR)
A well-known methodological framework in information systems that brings both practice and theory into the solution approach is DSR (Hevner et al., 2004). DSR is a problem-oriented approach that attempts to gain a profound understanding of how novel solutions, called artifacts, can be designed in the field of information systems. An artifact reflects the re-
sented as a construct, model, method, or instantiation.
The knowledge from the generated artifacts can be
used for future research or for practical implementa-
tion (Hevner et al., 2004). Knowledge about the de-
sign problem can be generated through constructing
and applying artifacts. In principle, the DSR frame-
work according to Hevner (2007) is presented as a
DSR cycle model with three different research cycles.
These are the relevance cycle, rigor cycle and design
cycle.
The cycles can also be abstracted into a sequential process in which each run represents a single iteration. Peffers et al. (2006) divided the De-
sign Science Research Process Model into six distinct
phases: problem identification, objective of solution,
design and development, demonstration, evaluation,
and communication. This process model also allows
iterating by going back to previous phases or starting
at different entry points (Peffers et al., 2006). Over time, several additional process models and advancements of the presented models have emerged (Gregor and Jones, 2007; Vaishnavi and Kuechler, 2004; Sonnenberg and Brocke, 2012).
2.3 Rigor
Primarily due to its practical orientation, the DSR
approach has been repeatedly criticized since its es-
tablishment for not meeting the demands of rigorous
research (Kuechler and Vaishnavi, 2011). “In both design-science and behavioral-science research, rigor is derived from the effective use of the knowledge base - theoretical foundations and research methodologies” (Hevner et al., 2004, p. 88). Thus, Hevner
et al. (2004) set up seven guidelines ((1) design as an
artifact, (2) problem relevance, (3) design evaluation,
(4) research contribution, (5) research rigor, (6) de-
sign as a search process, and (7) communication of re-
search) for constructing and applying artifacts. These
guidelines are based on the fundamental principle that
DSR is a problem-solving process. Within Hevner's (2007) DSR cycle model, rigor is ensured in the form of a separate cycle. According to Peffers et al. (2007),
rigor must be ensured in both development and evalu-
ation. This can be ensured through the thoughtful se-
lection and application of existing research methods
and reference models (Hevner et al., 2004).
While Arnott and Pervan (2012) subdivided rigor into the theoretical foundation and the selection and use of appropriate research methods, this paper fo-
cuses mainly on applied research methods. The rea-
son for this focus is that the original study (Arnott
and Pervan, 2012) already identified a high level of
theoretical rigor but a low level of rigor in relation to
research methods. Several publications present possi-
ble methods that are well suited for DSR (Sonnenberg
and Brocke, 2012; Peffers et al., 2012).
3 METHODOLOGY
Fig. 1 illustrates our research approach. In the first
step, we conducted a literature review to identify rel-
evant publications. For this, we followed the guide-
lines of vom Brocke et al. (2009) to meet the require-
ments for rigor in this work. Initially, the criteria for
the selection of suitable literature are determined. The
selection first focused on the three literature databases
IEEE, ScienceDirect, and SpringerLink. Using the
keyword search, relevant publications were retrieved with the following search string: “Data Science” AND (“Design Science*” OR “DSR”). Applying this search string, 82 publications are identified and
examined for duplicates. Their suitability in terms of
content is assessed on the basis of the abstracts and
the respective methodological sections. Here, a publi-
cation is only classified as suitable if it actively states
that it proceeds according to DSR. In addition, data
science has to play at least a thematic role. Accord-
ingly, the number of relevant publications is reduced
to a final number of 62, which are subsequently ana-
lyzed in depth.
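For illustration, the selection logic described above could be expressed as a small screening routine over an exported result list. The record fields, helper names, and matching logic below are assumptions made for this sketch; the actual screening in this study was carried out manually on abstracts and methodology sections.

    # Illustrative sketch of the screening step (assumed record fields).

    def matches_search_string(text: str) -> bool:
        # “Data Science” AND (“Design Science*” OR “DSR”)
        t = text.lower()
        return "data science" in t and ("design science" in t or "dsr" in t)

    def screen(records: list[dict]) -> list[dict]:
        seen_titles = set()
        selected = []
        for rec in records:
            title = rec["title"].strip().lower()
            if title in seen_titles:
                continue  # remove duplicates
            seen_titles.add(title)
            text = " ".join([rec["title"], rec.get("abstract", ""), rec.get("methodology", "")])
            if not matches_search_string(text):
                continue
            # Keep only publications that explicitly state a DSR approach and in
            # which Data Science plays at least a thematic role (manual judgement).
            if rec.get("states_dsr_approach") and rec.get("data_science_relevant"):
                selected.append(rec)
        return selected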
The second step is to analyze the characteristics of the publications. For this, qualitative content analysis according to Mayring (2000) is used. The publications are systematically analyzed on the basis of previously developed coding guidelines. Through an inductive category development process, two new dimensions are identified. These are extended by two further dimensions that are deductively derived:
1. The use of research process models, both in the area of DSR and of Data Science. In the context of
the analysis, only those sources were taken into
account where the authors state that they follow a
DSR approach.
2. The role of Data Science in the publications. Based on the literature examined, it is inductively identified that Data Science
(a) appears as an enabler and is only considered
from a meta-perspective. No implementation or
detailed explanation of Data Science concepts
follows.
(b) appears as a demonstrator within the framework, for example as a prototype in the demonstration phase, while the actual main artifact is a framework.
(c) appears as an artifact itself and is thus the main
component of the publication. Here, for exam-
ple, a prediction model is developed for a prob-
lem and its performance is subsequently vali-
dated.
3. The type of artifact developed according to the
classification of Hevner et al. (2004).
4. The research methods used or cited are classified
according to Sonnenberg and Brocke (2012).
If such a concept is identified in a publication, this
is indicated by an X in the concept matrix. A finer
distinction is made for the research methods in order
to assess the rigor of Data Science Research as men-
tioned in section 2.3. If the use of a research method is
only mentioned, a blank circle is filled in. If a suitable
publication is referenced and justified, a full circle is
used as a symbol instead.
In the third step, an assessment based on the pub-
lication by Arnott and Pervan (2012) is made. For
this purpose, the classification is broken down into
the areas Weak, Adequate, and Strong. As the orig-
inal paper did not provide a comprehensive definition of the assessment categories, the following guidelines are defined for the purpose of this paper:
Weak: The rigor of a publication is classified as Weak if it (1) does not mention any research methods, (2) mentions research methods but does not give any references or justifications for the choice of methods, or (3) appropriately references and justifies less than 50 % of the chosen research methods.
Adequate: The rigor of a publication is classi-
fied as Adequate if (1) it uses at least one research
method and (2) justifies and supports the choice
of research methods with appropriate references
in 50 % or more of the cases.
Strong: The rigor of a publication is classified as Strong if, in contrast to Adequate, it justifies and references every chosen research method.
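The coding scheme and the assessment rules above can be summarized in a short, illustrative sketch. The enum, function, and threshold handling below are assumptions made for this description and are not taken from Arnott and Pervan (2012).

    # Illustrative sketch of the coding scheme (blank vs. full circle) and the
    # Weak/Adequate/Strong rules defined above; names are assumptions.
    from enum import Enum

    class MethodCoding(Enum):
        MENTIONED = "blank circle"    # research method is only mentioned
        REFERENCED = "full circle"    # research method is referenced and justified

    def assess_rigor(codings: list[MethodCoding]) -> str:
        """Apply the assessment rules to the method codings of one publication."""
        if not codings:
            return "Weak"                      # rule (1): no research method mentioned
        referenced = sum(c is MethodCoding.REFERENCED for c in codings)
        share = referenced / len(codings)
        if share == 1.0:
            return "Strong"                    # every chosen method referenced and justified
        if share >= 0.5:
            return "Adequate"                  # at least half of the methods referenced
        return "Weak"                          # rules (2) and (3): none or less than half

    # Example: three methods mentioned, only one of them referenced -> "Weak"
    print(assess_rigor([MethodCoding.REFERENCED, MethodCoding.MENTIONED, MethodCoding.MENTIONED]))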
According to Peffers et al. (2012), a four- or multi-eye (dual control) principle is applied. In this way, an appropriate and consistent assessment is attempted. In a fourth
step, the identified dimensions are presented in a con-
cept matrix according to the approach of Webster and
Watson (2002), based on the presented assessment
rules. The findings can be derived from the matrix
and will be presented in the following chapter.
4 FINDINGS
The rigor assessment of the 62 publications according to the procedure described above yields different findings. These are mapped in the concept matrix below and subsequently explained. For this purpose, this section is subdivided according to the dimensions of the concept matrix.
4.1 Assessment
Looking at the assessment of the entire sample of this study, three publications (4.8 %) are in the category Strong, eleven publications (17.7 %) are Adequate, and 48 (77.4 %) are Weak. In the study conducted by Arnott and Pervan (2012) about 10 years ago, 3.3 % were Strong, 22.1 % were Adequate, and 74.6 % were Weak. Thus, within this sample the distribution across the categories remains roughly the same. In the context of Data Science publications, the weak rigor with regard to the selection, justification, and citation of appropriate research methods therefore continues to be clearly evident.
4.2 Research Methods
During the analysis, numerous research methods could be identified. A detailed overview of all methods as well as of which publications use which methods can be found in the concept matrix (Table 1). Upon closer examination of the research methods used, the most common one is literature analysis, followed by expert interviews. Although they are used in 33 and 24 cases, respectively, their concrete application is seldom explained or supported by references to research method literature.
A distinct portion of the sample did not mention any applied research method in the context of DSR. In eight cases, no research method could be observed at all; these publications limit themselves to naming DSR in general. In addition, eight further cases mentioned only a single research method for their research approach. In both groups, this resulted in a classification as Weak.
Furthermore, there are publications that merely mention the use of research methods without referencing them. This applies to 41 cases (66.1 %) after classification. This percentage does not allow a general conclusion about how many methods are mentioned, since publications in this group sometimes mention more than four methods.
Table 1: Overview of Publication Characteristics and Assessment Results.
Nevertheless, according to the rigor rules defined above, all these publications are rated as Weak. The difference between these cases and the total number of Weak cases is made up of publications that provide references for less than half of their chosen methods.
4.3 IT-Artifact and Role of Data Science
A dedicated examination of rigor considering the Data Science role within the publications did not reveal any relation. The same applies to the differentiation per artifact type. However, independent of the research question, the distribution of the individual classes is nevertheless interesting.
4.4 Process Model
Since the sample was selected based on the use of
DSR, it is not surprising that each publication men-
tions a process model. A corresponding reference is
always given as well. Overall, a variety of sources
is used. Sometimes publications referenced several
sources for the used methodology. However, when
considering Data Science process models, only three publications mention CRISP-DM. Other Data Science process models are not applied in this sample. The ex-
tended concept matrix showing all process models is
available from the authors upon request.
5 DISCUSSION
The study reveals significant weaknesses in connec-
tion with the selection and citation of suitable re-
search methods. As a result, over 77 % of publica-
tions are rated as weak in regard to the rigorous se-
lection and application of research methods. When publications lack rigor, the results may not be replicable by other researchers, may be unreliable, or may even be incorrect; wrong conclusions can then be drawn, with implications for further research and practice. Although each publication specifies a DSR methodology and provides evidence of it through widely used references, this still does not imply a fundamental rigor in the approach. Peffers et al. (2007), referenced by 29 publications (46.7 %), also confirm that the citation of research methods contributes to better understanding as well as increased validity and generalizability of the results. A possible expla-
nation might be that the researchers work conscientiously and in a structured way in the background, but do not document this in the paper. Moreover, researchers may
be unaware of the actual use of the research methods
(e.g. experiment, survey, case study) and the need to
explicitly validate the results from them.
Furthermore, it can be seen in the concept matrix that only three cases name a Data Science process model. The referenced CRISP-DM, which is popular in research and practice, has fundamental similarities to DSR. Both approaches focus on the development and validation of solutions to practical problems using similar phases and methods. This could be a reason for the low number of explicitly named Data Science process models, while the number of DSR process models is high.
Especially in publications in which the role of
Data Science is a demonstrator or artifact, metrics
such as accuracy are often used. One explanation for
the omission of specific justifications and references
could be that for many metrics in the field of Data Sci-
ence no foundation papers are available. Furthermore,
authors could assume that the methods used are gen-
erally known and therefore refrain from citing them.
Moreover, mathematical constructs such as an ac-
curacy score are often used for evaluation in the field
of Data Science. While Sonnenberg and Brocke (2012) argue that logical reasoning and mathematical proofs can also serve as an evaluation method, this was not considered a research method in itself in this work. These metrics usually have a practical origin and are generally known in the community. Nevertheless, a reference to work that justifies their selection and describes the metrics is usually omitted. The results of this study can still be considered valid, because the practical origin of such metrics does not excuse the absence of references for the chosen research methods or the absence of research methods in other phases of DSR. In addition, these metrics only cover quantitative evaluations; a qualitative perspective is therefore completely missing if no other evaluation method is conducted. For this reason, the assessment procedure applied here can generally be regarded as well suited to answer the research question.
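As an illustration of such a purely quantitative metric, accuracy is simply the share of correct predictions. The snippet below is a generic sketch and is not taken from any of the reviewed publications.

    # Generic sketch of an accuracy computation, a typical quantitative evaluation
    # metric for Data Science artifacts; it says nothing about qualitative aspects.

    def accuracy(y_true: list, y_pred: list) -> float:
        correct = sum(t == p for t, p in zip(y_true, y_pred))
        return correct / len(y_true)

    # Example: 4 of 5 predictions are correct -> accuracy of 0.8
    print(accuracy([1, 0, 1, 1, 0], [1, 0, 1, 0, 0]))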
6 CONCLUSION
According to the findings of Arnott and Pervan (2012), many publications fall short in rigorously describing the selection, application, and justification of appropriate research methods. This motivated the present study to re-validate these findings after ten years. For
this purpose, the focus lies on the increasingly popu-
lar field of Data Science. Other publications have al-
ready shown that DSR is a suitable methodology for
Data Science publications. According to the search
criteria, 62 publications in the investigated time pe-
riod were based on a DSR approach.
6.1 Summary
This paper examined the rigorous use of research methods and systematically mapped them in a concept matrix (Table 1) to answer the research question. Based
on the research methods mentioned in each case, as
well as on the corresponding references provided, an
assessment of rigor is made. In summary, 77.4 % of
these publications were categorized as weak in terms
of rigor. Consequently, the distribution remains similar to the data from the 2012 study by Arnott and Pervan (2012). Looking at the analysis of Data Science Research, 66.1 % do not reference or justify their research methods with appropriate literature references at all. At the same time, eight publications completely abstained from mentioning any research method. Only three publications could be identified
as strongly rigorous. The main reason for this assess-
ment is the consistent use of research methods for all
activities within the various phases of DSR. In each
case, references are cited that prove that it is an es-
tablished research method whose success can be ex-
pected. From this it can be concluded that, since the methods have been used in a comprehensible and promising way in many contexts, the results of the respective publication can also be considered valid.
6.2 Limitations
We need to acknowledge some limitations to our re-
search. First, it must be taken into account that during
the data collection and classification, despite main-
taining a dual control principle, the absence of errors
cannot be guaranteed. Due to the diversity of this research area and the limited precision of the search string used, it cannot be guaranteed that all publications matching the topic were found. In addition, the differentiation by con-
cepts and role of Data Science did not lead to any find-
ings due to the small sub-sample size in some cases.
In addition, the rigor applied by researchers can only
be assessed based on what was written in the publica-
tions. This means that only explicitly mentioned re-
search methods can be considered. Researchers may (intentionally) omit details or steps for different reasons. It is important to acknowledge the limitations of assessing rigor based solely on the written documentation in the form of publications.
6.3 Future Work
To validate this work as well as for further analysis, a
replication of the study with additional Data Science
publications, even if they may not explicitly state that
they proceed according to DSR, would be useful. In
future studies, an exploration of DSR in the domain
of Data Science may benefit from adopting a quali-
tative methodology. By analyzing Data Science research in more depth, rigor can be assessed beyond the applied research methods described in the corresponding publications. Another possibility for further work would be to develop a DSR process model tailored to the specific needs of Data Science research. In order to take the characteristics of Data Science Research into account, the authors propose future work on the integration of a Data Science process model such as CRISP-DM into DSR.
REFERENCES
Abbasi, A., Sarker, S., and Chiang, R. H. (2016). Big data
research in information systems: Toward an inclusive
research agenda. JAIS, 17:1–32.
Abdel-Karim, B. M., Pfeuffer, N., and Hinz, O. (2021).
Machine learning in information systems - a biblio-
graphic review and open research issues. Electronic
Markets, 31:643–670.
Abkenar, S. P., Vanani, I. R., Sohrabi, B., and Manian, A.
(2022). Social commerce mobile application enhance-
ment: a hybrid text clustering - topic modeling busi-
ness model analysis. Electronic Commerce Research.
Alturki, A., Bandara, W., and Gable, G. G. (2012). Design
science research and the core of information systems.
LNCS, 7286:309–327.
Andrews, R., van Dun, C. G., Wynn, M. T., Kratsch, W., Röglinger, M. K., and ter Hofstede, A. H. (2020). Quality-informed semi-automated event log generation for process mining. Decision Support Systems, 132.
Arnott, D. and Pervan, G. (2012). Design science in deci-
sion support systems research: An assessment using
the hevner, march, park, and ram guidelines. JAIS,
13:923–949.
Assarandarban, M., Bhowmik, T., Do, A. Q., Chekuri, S.,
Wang, W., and Niu, N. (2021). Foraging-theoretic tool
composition: An empirical study on vulnerability dis-
covery. Proceedings of IRI 2021, pages 139–146.
Baijens, J., Helms, R., and Iren, D. (2020). Applying scrum
in data science projects. Proceedings of CBI 2020,
1:30–38.
Benbasat, I. and Zmud, R. W. (2003). The identity crisis
within the is discipline: Defining and communicating
the discipline's core properties. MIS Quarterly, 27:183–
194.
Bokolo, A. J. (2022). Exploring interoperability of dis-
tributed ledger and decentralized technology adoption
in virtual enterprises. ISeB.
Brunk, J., Stierle, M., Papke, L., Revoredo, K., Matzner,
M., and Becker, J. (2021). Cause vs. effect in context-
sensitive prediction of business process instances. In-
formation Systems, 95.
Bunnell, L., Osei-Bryson, K. M., and Yoon, V. Y. (2020).
Finpathlight: Framework for an multiagent recom-
mender system designed to increase consumer finan-
cial capability. Decision Support Systems, 134.
Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz,
T. P., Shearer, C., and Wirth, R. (2000). Crisp-dm 1.0:
Step-by-step data mining guide.
Chasupa, T. L. and Paireekreng, W. (2021). The framework
of extracting unstructured usage for big data platform.
pages 90–94. IEEE.
Clapham, B., Bender, M., Lausen, J., and Gomber, P.
(2022). Policy making in the financial industry: A
framework for regulatory impact analysis using tex-
tual analysis. Journal of Business Economics.
Elragal, A. and Haddara, M. (2019). Design science re-
search: Evaluation in the lens of big data analytics.
Systems, 7:1–8.
Elragal, A. and Klischewski, R. (2017). Theory-driven
or process-driven prediction? epistemological chal-
lenges of big data analytics. Journal of Big Data, 4:1–
20.
Fabian, B., Bender, B., Hesseldieck, B., Haupt, J., and Less-
mann, S. (2021). Enterprise-grade protection against
e-mail tracking. Information Systems, 97.
Fatima, S., Desouza, K. C., Buck, C., and Fielt, E. (2022).
Public ai canvas for ai-enabled public value: A design
science approach. GIQ, 39.
Feio, I. C. L. and dos Santos, V. D. (2022). A strategic
model and framework for intelligent process automa-
tion. volume 2022-June, pages 22–25.
Ferro, D. B., Brailsford, S., Bravo, C., and Smith, H. (2020).
Improving healthcare access management by predict-
ing patient no-show behaviour. Decision Support Sys-
tems, 138.
Filipiak, D., Węcel, K., Stróżyna, M., Michalak, M., and Abramowicz, W. (2020). Extracting maritime traffic networks from ais data using evolutionary algorithm. BISE, 62:435–450.
Fischer, D. A., Goel, K., Andrews, R., van Dun, C.
G. J., Wynn, M. T., and Röglinger, M. (2020). En-
hancing event log quality: Detecting and quantify-
ing timestamp imperfections. volume 12168 LNCS,
pages 309–326. Springer Science and Business Media
Deutschland GmbH.
Fussl, A. and Nissen, V. (2022). Interpretability of knowl-
edge graph-based explainable process analysis. pages
9–17.
Förster, M., Bansemir, B., and Roth, A. (2021). Innovation von datengetriebenen Geschäftsmodellen. HMD Praxis der Wirtschaftsinformatik, 58:595–610.
Goes, P. B. (2014). Big data and is research [editor’s com-
ments]. MIS Quarterly, 38:iii–viii.
Graafmans, T., Turetken, O., Poppelaars, H., and Fahland,
D. (2021). Process mining for six sigma: A guideline
and tool support. BISE, 63:277–300.
Gregor, S. and Hevner, A. R. (2013). Positioning and pre-
senting design science research for maximum impact.
MIS Quarterly: Management Information Systems,
37:337–355.
Gregor, S. and Jones, D. (2007). The anatomy of a design
theory. JAIS, 8:312–335.
Górtowski, S. and Lewańska, E. (2020). Incremental modeling method of supply chain for decision-making support. volume 394, pages 34–44. Springer Science and Business Media Deutschland GmbH.
Hacker, J. and Riemer, K. (2021). Identification of user
roles in enterprise social networks: Method develop-
ment and application. BISE, 63:367–387.
Hevner, A. (2007). A three cycle view of design science
research. SJIS, 19:87–92.
Hevner, A. R., March, S. T., Park, J., and Ram, S. (2004).
Design science in information systems research. MIS
Quarterly: Management Information Systems, 28:75–
105.
Huang, G. and Buss, M. (2020). Decision-making in ma-
chine learning. Proceedings of AEECA 2020, pages
256–260.
Hunke, F., Seebacher, S., and Thomsen, H. (2020). Please
tell me what to do towards a guided orchestration
of key activities in data-rich service systems. volume
12388 LNCS, pages 426–437. Springer Science and
Business Media Deutschland GmbH.
Johnson, M., Albizri, A., Harfouche, A., and Fosso-Wamba,
S. (2022). Integrating human knowledge into artificial
intelligence for complex and ill-structured problems:
Informed artificial intelligence. IJIM, 64.
Jordan, M. I. and Mitchell, T. M. (2015). Machine learn-
ing: Trends, perspectives, and prospects. Science,
349:255–260.
Joubert, A., Murawski, M., and Bick, M. (2021). Measuring
the big data readiness of developing countries – index
development and its application to africa. Information
Systems Frontiers.
Kaymakci, C., Wenninger, S., and Sauer, A. (2021). A
holistic framework for ai systems in industrial appli-
cations. volume 47, pages 78–93. Springer Science
and Business Media Deutschland GmbH.
Khalil, M. and Rambech, M. (2022). Eduino: A telegram
learning-based platform and chatbot in higher educa-
tion. volume 13329 LNCS, pages 188–204. Springer
Science and Business Media Deutschland GmbH.
Kortmann, F., Fassmeyer, P., Funk, B., and Drews, P.
(2022). Watch out, pothole! featuring road dam-
age detection in an end-to-end system for autonomous
driving. DKE, 142.
Krasikov, P., Legner, C., and Eurich, M. (2021). Sourc-
ing the right open data: A design science research
approach for the enterprise context. volume 12807
LNCS, pages 313–327. Springer Science and Busi-
ness Media Deutschland GmbH.
Kratsch, W., König, F., and Röglinger, M. (2022). Shedding light on blind spots - developing a reference architecture to leverage video data for process mining. Decision Support Systems, 158.
Kregel, I., Stemann, D., Koch, J., and Coners, A. (2021).
Process mining for six sigma: Utilising digital traces.
CAIE, 153.
Kuechler, B. and Vaishnavi, V. (2011). Promoting relevance
in is research: An informing system for design science
research. InformingSciJ, 14:125–138.
Löhr, B., Brennig, K., Bartelheimer, C., Beverungen, D., and Müller, O. (2022). Process mining of knowledge-intensive processes: An action design research study in manufacturing. pages 251–267.
Maheswaran, S., Indhumathi, N., Dhanalakshmi, S., Nan-
dita, S., Shafiq, I. M., and Rithka, P. (2022). Iden-
tification and classification of groundnut leaf disease
using convolutional neural network. pages 251–270.
Mar-Raave, J. R. D., Bahşi, H., Mršić, L., and Hausknecht, K. (2021). A machine learning-based forensic tool for image classification - a design science approach. Forensic Science International: Digital Investigation, 38.
Martin, D., Kühl, N., von Bischhoffshausen, J. K., and Satzger, G. (2020). System-wide learning in cyber-physical service systems: A research agenda. volume 12388 LNCS, pages 457–468. Springer Science and Business Media Deutschland GmbH.
Martínez-Plumed, F., Contreras-Ochando, L., Ferri, C., Hernández-Orallo, J., Kull, M., Lachiche, N., Ramírez-Quintana, M. J., and Flach, P. (2021). Crisp-dm twenty years later: From data mining processes to data science trajectories. IEEE Transactions on Knowledge and Data Engineering, 33(8):3048–3061.
Mayring, P. (2000). Pensionierung als Krise oder Glücksgewinn? – Ergebnisse aus einer quantitativ-qualitativen Längsschnittuntersuchung. Zeitschrift für Gerontologie und Geriatrie, 33:124–133.
Miah, S. J., Solomonides, I., and Gammack, J. G. (2020). A
design-based research approach for developing data-
focussed business curricula. Education and Informa-
tion Technologies, 25:553–581.
Molnar, E., Molnar, R., and Gregus, M. (2020). Business
schools and ris3 – enterprise architecture perspective.
Mombini, H., Tulu, B., Strong, D., Agu, E., Nguyen, H.,
Lindsay, C., Loretz, L., Pedersen, P., and Dunn, R.
(2020). Design of a machine learning system for pre-
diction of chronic wound management decisions. vol-
ume 12388 LNCS, pages 15–27. Springer Science and
Business Media Deutschland GmbH.
Monteiro, J., Barata, J., Veloso, M., Veloso, L., and Nunes,
J. (2022). A scalable digital twin for vertical farming.
JAIHC.
Odu, N. B., Prasad, R., Onime, C., and Sharma, B. K.
(2022). How to implement a decision support for dig-
ital health: Insights from design science perspective
for action research in tuberculosis detection. IJIM, 2.
Pan, Y. and Zhang, L. (2021). A bim-data mining integrated
digital twin framework for advanced project manage-
ment. Automation in Construction, 124.
Panzner, M., von Enzberg, S., Meyer, M., and Dumitrescu,
R. (2022). Characterization of usage data with the
help of data classifications. Journal of the Knowledge
Economy.
Pascal, A. and Renaud, A. (2020). 15 years of informa-
tion systems design science research: A bibliographic
analysis. Proceedings of HICSS 2020, pages 5016–
5025.
Peffers, K., Rothenberger, M., Tuunanen, T., and Vaezi, R.
(2012). Design science research evaluation. volume
7286, pages 398–410.
Peffers, K., Tuunanen, T., Gengler, C. E., Rossi, M., Hui,
W., Virtanen, V., and Bragge, J. (2006). The design
science research process: A model for producing and
presenting information systems research. pages 83–
106.
Peffers, K., Tuunanen, T., Rothenberger, M. A., and Chat-
terjee, S. (2007). A design science research method-
ology for information systems research. JMIS, 24:45–
77.
Peixoto, D., Faria, M., Macedo, R., Peixoto, H., Lopes, J., Barbosa, A., Guimarães, T., and Santos, M. F. (2022). Determining internal medicine length of stay by means of predictive analytics. volume 13566 LNAI, pages 171–182.
Pohl, M., Staegemann, D. G., and Turowski, K. (2022).
The performance benefit of data analytics applica-
tions. volume 201, pages 679–683. Elsevier B.V.
Romanov, D., Molokanov, V., Kazantsev, N., and Jha,
A. K. (2022). Removing order effects from human-
classified datasets: A machine learning method to im-
prove decision making systems. Decision Support
Systems.
Saar-Tsechansky, M. (2015). The business of business data
science in is journals. MIS Quarterly: Management
Information Systems, 39:iii–vi.
Samuel-Ojo, O., Shimabukuro, D., Chatterjee, S., Muthui,
M., Babineau, T., Prasertsilp, P., Ewais, S., and
Young, M. (2010). Meta-analysis of design science
research within the is community: Trends, patterns,
and outcomes. Global Perspectives on Design Science
Research, pages 124–138.
Schuster, T., Waidelich, L., and Volz, R. (2021). Matu-
rity models for the assessment of artificial intelligence
in small and medium-sized enterprises. volume 429
LNBIP, pages 22–36. Springer Science and Business
Media Deutschland GmbH.
Schwade, F. (2021). Social collaboration analytics frame-
work: A framework for providing business intelli-
gence on collaboration in the digital workplace. Deci-
sion Support Systems, 148.
Schwaiger, J., Hammerl, T., Florian, J., and Leist, S. (2021).
Ur: Smart–a tool for analyzing social media content.
ISeB, 19:1275–1320.
Schweiger, L., Barth, L., and Meierhofer, J. (2020). Data
resources to create digital twins. Proceedings of SDS
2020, pages 55–56.
Simonofski, A., Fink, J., and Burnay, C. (2021). Supporting
policy-making with social media and e-participation
platforms data: A policy analytics framework. GIQ,
38.
Soares, C. M. (2022). Crowdsourced data to improve mu-
nicipalities governance: Sesimbra's case. pages 22–
25.
Sonnenberg, C. and Brocke, J. V. (2012). Evaluations in the science of the artificial - reconsidering the build-evaluate pattern in design science research. LNCS, 7286.
Spanaki, K., Karafili, E., and Despoudi, S. (2021). Ai ap-
plications of data sharing in agriculture 4.0: A frame-
work for role-based data access control. IJIM, 59.
Truong, T. M., Lê, L. S., Paja, E., and Giorgini, P. (2021).
A data-driven, goal-oriented framework for process-
focused enterprise re-engineering. ISeB, 19:683–747.
Unhelkar, B. and Askren, J. (2020). Development and initial
validation of the big data framework for agile busi-
ness: Transformational innovation initiative. LNNS,
69:952–960.
Vaishnavi, V. and Kuechler, W. (2004). Design research in
information systems, volume 1.
van der Aalst, W. (2016). Data science in action.
van Dun, C., Moder, L., Kratsch, W., and Röglinger, M.
(2022). Processgan: Supporting the creation of busi-
ness process improvement ideas through generative
machine learning. Decision Support Systems.
Venkata, H. B. A., Calazacon, A., Mahmoud, T., and Hanne,
T. (2022). A technology recommender system based
on web crawling and natural language processing.
Proceedings of AIC 2022, pages 623–631.
Vereno, D., Polanec, K., and Neureiter, C. (2022). Evalu-
ating and improving model-based assessment of con-
textual data quality in smart grids. ISGT NA.
Volk, M., Staegemann, D., Trifonova, I., Bosse, S., and Tur-
owski, K. (2020). Identifying similarities of big data
projects-a use case driven approach. IEEE Access,
8:186599–186619.
vom Brocke, J., Simons, A., Niehaves, B., Riemer, K., Plattfaut, R., and Cleven, A. (2009). Reconstructing the giant: On the importance of rigour in documenting the literature search process.
Vössing, M., Kühl, N., Lind, M., and Satzger, G. (2022).
Designing transparency for effective human-ai collab-
oration. Information Systems Frontiers.
Warsinsky, S., Schmidt-Kraepelin, M., Thiebes, S., Wagner,
M., and Sunyaev, A. (2022). Gamified expert annota-
tion systems: Meta-requirements and tentative design.
volume 13229 LNCS, pages 154–166. Springer Sci-
ence and Business Media Deutschland GmbH.
Webster, J. and Watson, R. T. (2002). Analyzing the past
to prepare for the future: Writing a literature review.
MIS Quarterly, 26:xiii–xxiii.
Xia, L. (2022). Historical profile will tell? a deep learning-
based multi-level embedding framework for adverse
drug event detection and extraction. Decision Support
Systems, 160.
Yang, K. L., Hsu, S. C., and Hsu, H. M. (2020). Enriching
design thinking with data science: Using the taiwan
moving industry as a case. volume 1189 CCIS, pages
185–202. Springer.
Zhu, Y. and Xiong, Y. (2015). Defining data science. CoRR,
abs/1501.0:1–6.
Zschech, P., Horn, R., Höschele, D., Janiesch, C., and Hein-
rich, K. (2020). Intelligent user assistance for auto-
mated data mining method selection. BISE, 62:227–
247.