2021). In particular, the data collection, data preparation,
training, monitoring, and deployment processes of a
typical ML life cycle require novel software engineering
practices in comparison to traditional software
engineering (Sculley et al., 2015; Serban et al., 2020).
There is a persistent discrepancy between the engineering
of ML-capable systems and the engineering of traditional
software (Giray, 2021). To bridge this gap, traditional
software engineering approaches to implementing code
need to be revisited in light of the non-deterministic
nature of ML systems.
To address these issues, a two-day discussion among
160 practitioners and researchers on the challenges and
implications of engineering ML systems at the First
Symposium on Software Engineering for Machine
Learning Applications (SEMLA) (Khomh et al., 2018)
raised two key questions:
• “How should software development teams integrate
the AI model life cycle (training, testing, deploying,
evolving, and so on) into their software process?”
• “What new roles, artifacts, and activities come
into play, and how do they tie into existing agile
or DevOps processes?”
According to SEMLA, researching these two key
questions is essential to link software engineering and
ML development processes. Currently, a growing cor-
pus of academic literature is concerned with these
problems. Chandrasekaran et al. (2021), for example,
propose in their work on ML governance that an
operational life cycle consists of a data preparation,
a model development, and a model deployment phase.
Accordingly, they defined principals, the involvement
and interaction of these principals, and the life cycle
management of ML systems (Chandrasekaran et al.,
2021). Furthermore, Ritz et al. (2022) defined a
comprehensive process model for ML systems to capture
the dependencies between the artifacts and activities
in an ML life cycle in order to bridge the gap between
existing software engineering process models
and ML-specific procedures (Ritz et al., 2022). Also,
there has been a growing interest in defining princi-
ples, components, roles, and architectures in the con-
text of operationalizing MLOps workflows (Ruf et al.,
2021; Subramanya et al., 2022; Kreuzberger et al.,
2022).
Even though the key questions posed by SEMLA
have been addressed by the academic literature,
FedML introduces new processes and requirements
due to its decentralized model-to-data approach.
Because of this decentralization and the local training
and usage procedure, additional roles, activities,
artifacts, and life cycle stages need to be introduced.
Before FedML can be broadly operationalized, we argue
that these specific questions posed by SEMLA need to
be answered for FedML as well.
IEEE published a Guide for Architectural Frame-
work and Application of Federated Machine Learn-
ing (IEEE, 2021), which defines a standard for the
FedML reference architecture including user role de-
scriptions of the FedML process. According to this
IEEE standard, a participant could play the role of
a data owner, model user, coordinator, and/or auditor.
These roles are presented with their accompanying
actions in the FedML process. The reference architecture
serves as a generalized template for implementing
FedML processes, including their structures, the
respective elements, and the relations between them.
It can be used as a foundation for the governance of
FedML projects. However, according to this reference
architecture, a single role is responsible for multiple
steps of a typical MLOps life cycle. For instance,
the activities of a coordinator comprise the aggrega-
tion step, model management, data management, de-
ployment, and capabilities coordination.
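The observation that a single role spans several life cycle stages can be made concrete with a small sketch. The coordinator activities below are the ones listed above; the life cycle stage paired with each activity and the names `ASSIGNMENTS`, `stages_by_role`, and `roles_spanning_stages` are hypothetical choices made for illustration and are not taken from the IEEE guide.

```python
from collections import defaultdict

# (role, activity, life cycle stage)
# The activities are the coordinator activities named in the text.
# The stage assigned to each activity is an illustrative assumption.
ASSIGNMENTS = [
    ("coordinator", "aggregation", "training"),
    ("coordinator", "model management", "training"),
    ("coordinator", "data management", "data preparation"),
    ("coordinator", "deployment", "deployment"),
    ("coordinator", "capabilities coordination", "monitoring"),
]


def stages_by_role(assignments):
    """Group the life cycle stages touched by each role."""
    stages = defaultdict(set)
    for role, _activity, stage in assignments:
        stages[role].add(stage)
    return dict(stages)


def roles_spanning_stages(assignments, max_stages=1):
    """Return the roles whose activities cover more than max_stages stages."""
    grouped = stages_by_role(assignments)
    return sorted(
        role for role, stages in grouped.items() if len(stages) > max_stages
    )


if __name__ == "__main__":
    for role, stages in stages_by_role(ASSIGNMENTS).items():
        print(role, "->", sorted(stages))
    print("roles spanning several stages:", roles_spanning_stages(ASSIGNMENTS))
```

Under this assumed mapping, the coordinator is the only role reported, which illustrates why a finer breakdown of roles per stage may be useful.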
We argue that the separation of roles and activities
should be closely aligned with established MLOps
stages and principles, such that the FedML specifics
can be easily integrated into known MLOps work-
flows. Hence, a more structured breakdown of the
activities and roles in relation to the different stages
of an End-to-End FedML life cycle is needed to fully
capture the dependencies and interactions between
these roles. Furthermore, defining the set of actors,
their roles and activities not only provides a clearer
understanding of the project’s setup, but also plays a
crucial part in the governance of the project (de Man
and Luvison, 2019; Kujala et al., 2021; Caridà et al.,
2018). To accomplish this, we aim to identify the
differences between MLOps and FedML specifics and,
from them, derive a formal process model for a full
End-to-End FedML life cycle. Through a holistic process
model, we hope to enable practitioners to set up
FedML projects and to provide a foundation for their
governance.
In summary, the planned final contributions of our
ongoing research would comprise:
• Role Model: To structure the individual roles in-
cluding their corresponding capabilities and re-
sponsibilities.
• Activity Model: To understand the involvement,
operations, and activities of each role.
• Artifact Model: To show how artifacts are used
and structured in the process flow.
• Process Model: To capture the different life cy-
ICEIS 2023 - 25th International Conference on Enterprise Information Systems