Authors:
Tobias Müller 1,2; Milena Zahn 1,2 and Florian Matthes 1
Affiliations:
1 Technical University of Munich, TUM School of Computation, Information and Technology, Department of Computer Science, Boltzmannstrasse 3, Garching bei München, Germany
2 SAP SE, Dietmar-Hopp-Allee 16, Walldorf, Germany
Keyword(s):
Federated Machine Learning (FedML), Software Engineering, Machine Learning Operations (MLOps).
Abstract:
Federated Machine Learning is a promising approach for training machine learning models on decentralized data without the need for data centralization. Through a model-to-data approach, Federated Machine Learning offers considerable potential, ranging from privacy by design to substantially reduced communication costs and offline usage. However, implementing and managing Federated Machine Learning projects can be challenging, as it involves coordinating multiple parties across different stages of the project life cycle. We observed that Federated Machine Learning lacks clarity about the individual roles involved, including their activities, interactions, dependencies, and responsibilities, which is needed to establish governance and help practitioners operationalize Federated Machine Learning projects. We argue that a process model closely aligned with established MLOps principles can provide this clarification. In this position paper, we make a case for the necessity of a role model to structure distinct roles, an activity model to understand the involvement and operations of each role, and an artifact model to demonstrate how artifacts are used and structured. Additionally, we argue that a process model is needed to capture the dependencies and interactions between the roles, activities, and artifacts across the different stages of the life cycle. Furthermore, we describe our research approach and the current status of our ongoing research toward this goal. We believe that our proposed process model will provide a foundation for the governance of Federated Machine Learning projects and enable practitioners to leverage the benefits of decentralized data computation.