formance monitoring. However, the contents are not
further detailed, nor reflected in the process model.
A more recent approach is the Team Data Science
Process (TDSP) of Microsoft (Ericson et al., 2017).
This process also includes a deployment step in which
a pipeline is built for the update/upgrade of models.
In comparison to our process, the update and upgrade
loops are not depicted as loops in the process model.
Instead, these are covered in pipeline development.
The most similar approach that inspired our process model is the lifecycle of an analytical model
provided by Grossman (2018) and Pivarski et al. (2016).
They provide a phase of analytical modeling and an
analytic operations phase in the development and op-
eration of machine learning models. For the up-
date and upgrade of models they describe a cham-
pion/challenger strategy. This strategy is covered by
a patent (Chengwen, 2012). It contains a parallel process step, which includes building other model versions while the current version of the model is
in use. If the new version of the model is more accu-
rate, it will replace the old one. However, the cham-
pion/challenger strategy is not a particular strategy for
triggering the update of a model. Triggering updates
can be achieved through, e.g., manual or periodic retraining, or through adaptive machine learning algorithms that react to concept drift. The final step is to retire the
model and to re-deploy an improved one. In our pro-
cess, we consider retiring the model as a final activity
when the model is taken out of service. We consider a
model to consist of multiple model versions that rep-
resent the evolution of the model over time. These
model versions are generated through the update and upgrade loops that we introduce.
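The champion/challenger loop described above can be sketched as follows. This is a minimal illustration, not the patented procedure: the model interface, the exact-match accuracy metric, and the hold-out data are simplifying assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class ModelVersion:
    """One version of a model; `predict` stands in for a trained model."""
    name: str
    predict: Callable[[float], float]

def accuracy(model: ModelVersion, holdout: List[Tuple[float, float]]) -> float:
    """Fraction of (input, expected) pairs predicted exactly (simplified metric)."""
    hits = sum(1 for x, y in holdout if model.predict(x) == y)
    return hits / len(holdout)

def champion_challenger(champion: ModelVersion,
                        challengers: List[ModelVersion],
                        holdout: List[Tuple[float, float]]) -> ModelVersion:
    """Keep the champion in use; promote a challenger only if it
    scores strictly better on the same hold-out data."""
    best, best_score = champion, accuracy(champion, holdout)
    for challenger in challengers:
        score = accuracy(challenger, holdout)
        if score > best_score:
            best, best_score = challenger, score
    return best
```

Here, challengers are compared against the champion on shared hold-out data; in production, the comparison would typically run on live traffic or a monitored evaluation window while the champion remains in use.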
4 PROCESS MODEL
In this section, we introduce our main contribution, a
new process model for comprehensive machine learning model management. Loops are added to the process to enable reactions to changes in either the model itself or the context in which it is applied. The process is divided into two phases: (i) the experimental phase, in which models are planned, built,
and tested, and (ii) the operational phase, in which
models are used, monitored, and, if necessary, re-built
or retired. This is depicted in Figure 1. As a first step, we conducted a thorough analysis of the smart manufacturing domain described by the motivational scenario (see Section 2) in order to derive essential
features and extensions of the process. First, it is necessary to enrich the managed models with context information, which gives further insight into how they are used and whether they need to be updated to fit
their context. Second, stakeholder-specific requirements need to be considered. Depending on the domain, these requirements can be very heterogeneous.
However, they are of great importance to tailor the
process to the specific needs of the use cases. In the
following, we describe the steps of our process, focus-
ing on the update and upgrade loops, as highlighted in
Figure 1. For each step we describe relevant functions
to manage machine learning models.
4.1 Step 1: Plan Model
In this step, it is important to consider the specific requirements of the corresponding use case, its domain,
and the involved stakeholders. Making mistakes in
the model planning step can lead to costly re-planning
and to misleading results. Consequently, business re-
quirements need to be defined first, considering the
goals of the desired use cases. Second, stakeholder
workshops need to be organized, in which all involved
stakeholders can verify and evaluate these require-
ments and, if necessary, change them or come up
with new ones. Furthermore, it needs to be evaluated whether the desired model is feasible, i.e., whether it is economical, whether it achieves the defined goals, and whether the necessary data are available. The
planning step is essential for the (economical) success
of the model. If the desired model is not feasible, for
example, because the required data is not available or
the costs for realization are too high, it is discarded as
depicted in Figure 1 after Step 1. Consequently, the
process directly moves to the retirement of the model
(cf. Figure 1, step 6).
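The feasibility decision after Step 1 can be expressed as a simple gate. The three criteria below follow the discussion above, while the function and field names are illustrative, not part of the process model itself.

```python
from dataclasses import dataclass

@dataclass
class FeasibilityResult:
    """Outcome of the feasibility evaluation in the planning step."""
    data_available: bool   # are the necessary data available?
    economical: bool       # is the model economically viable?
    goals_reachable: bool  # can the model reach the defined goals?

def next_process_step(result: FeasibilityResult) -> str:
    """Continue to model building only if all criteria hold;
    otherwise move directly to model retirement (cf. Figure 1)."""
    if result.data_available and result.economical and result.goals_reachable:
        return "build_model"   # proceed to Step 2
    return "retire_model"      # proceed to Step 6
```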
Management Considerations. In this step, the
model needs to be semantically enriched with plan-
ning data that is further refined in the upcoming steps
of our process model. The planning data is mainly
related to business concerns and the expected usage
of the model in production. Planning data includes
information about, e.g., the corresponding use case,
the prediction that should be made, and the decision
based on the model. A method that can be used to col-
lect planning data via stakeholder workshops is the
machine learning canvas (Dorard, 2019). It helps to align the domain knowledge of business people, data scientists, and IT experts. However, the machine
learning canvas is rather static and provided as a docu-
ment template. It would be useful to store its contents
and link them to models in order to enable a semantic
search for models. For example, data scientists may want to search for models to get a rough idea of how to develop a new model for a similar use case.
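As a sketch of this idea, canvas contents could be stored as structured planning data linked to model identifiers, enabling a simple keyword search over them. The field names below are illustrative assumptions, not the actual fields of the machine learning canvas.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class PlanningData:
    """Planning metadata attached to a model (illustrative fields)."""
    use_case: str
    prediction: str
    decision: str
    tags: List[str] = field(default_factory=list)

class ModelRegistry:
    """Minimal in-memory registry linking models to their planning data."""

    def __init__(self) -> None:
        self._planning: Dict[str, PlanningData] = {}

    def register(self, model_id: str, planning: PlanningData) -> None:
        self._planning[model_id] = planning

    def search(self, keyword: str) -> List[str]:
        """Return ids of models whose planning data mentions the keyword."""
        kw = keyword.lower()
        return [
            model_id for model_id, p in self._planning.items()
            if kw in p.use_case.lower()
            or kw in p.prediction.lower()
            or kw in p.decision.lower()
            or any(kw in tag.lower() for tag in p.tags)
        ]
```

A data scientist could then, e.g., search the registry for "maintenance" to find earlier models planned for similar use cases and reuse their planning decisions as a starting point.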
ICEIS 2019 - 21st International Conference on Enterprise Information Systems