formance monitoring. However, the contents are not
further detailed, nor reflected in the process model.
A more recent approach is the Team Data Science
Process (TDSP) of Microsoft (Ericson et al., 2017).
This process also includes a deployment step in which
a pipeline is built for the update/upgrade of models.
In comparison to our process, the update and upgrade
loops are not depicted as loops in the process model.
Instead, these are covered in pipeline development.
The most similar approach that inspired our process model is the lifecycle of an analytical model
provided by Grossman (2018) and Pivarski et al. (2016).
They provide a phase of analytical modeling and an
analytic operations phase in the development and op-
eration of machine learning models. For the up-
date and upgrade of models they describe a cham-
pion/challenger strategy. This strategy is covered by
a patent (Chengwen, 2012). It contains a parallel process step, which includes building other model versions while the current version of the model is
in use. If the new version of the model is more accu-
rate, it will replace the old one. However, the cham-
pion/challenger strategy is not a particular strategy for
triggering the update of a model. Triggering updates
can be achieved through, e.g., manual or periodic retraining, or through adaptive machine learning algorithms that react to concept drift. The final step is to retire the
model and to re-deploy an improved one. In our pro-
cess, we consider retiring the model as a final activity
when the model is taken out of service. We consider a
model to consist of multiple model versions that rep-
resent the evolution of the model over time. These
model versions are generated through the update and upgrade loops that we introduce.
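The champion/challenger loop described above can be sketched as follows. This is a minimal illustration, not the patented procedure: the model interface, the exact-match accuracy metric, and the hold-out data are simplifying assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class ModelVersion:
    """One version of a model; `predict` stands in for a trained model."""
    name: str
    predict: Callable[[float], float]

def accuracy(model: ModelVersion, holdout: List[Tuple[float, float]]) -> float:
    """Fraction of (input, expected) pairs predicted exactly (simplified metric)."""
    hits = sum(1 for x, y in holdout if model.predict(x) == y)
    return hits / len(holdout)

def champion_challenger(champion: ModelVersion,
                        challengers: List[ModelVersion],
                        holdout: List[Tuple[float, float]]) -> ModelVersion:
    """Keep the champion in use; promote a challenger only if it
    scores strictly better on the same hold-out data."""
    best, best_score = champion, accuracy(champion, holdout)
    for challenger in challengers:
        score = accuracy(challenger, holdout)
        if score > best_score:
            best, best_score = challenger, score
    return best
```

Here, challengers are compared against the champion on shared hold-out data; in production, the comparison would typically run on live traffic or a monitored evaluation window while the champion remains in use.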
4 PROCESS MODEL
In this section, we introduce our main contribution, a
new process model for comprehensive machine learning model management. Loops are added to the process to enable reactions to changes in either the model itself or the context in which it is applied. The process is divided into two phases: (i) the experimental phase, in which models are planned, built,
and tested, and (ii) the operational phase, in which
models are used, monitored, and, if necessary, re-built
or retired. This is depicted in Figure 1. As a first step, we conducted a thorough analysis of the smart manufacturing domain described by the motivational scenario (see Section 2) in order to derive essential
features and extensions of the process. First, it is necessary to enrich the managed models with context information, which gives further insight into how they are used and whether they need to be updated to fit
their context. Second, stakeholder-specific requirements need to be considered. Depending on the domain, these requirements can be very heterogeneous.
However, they are of great importance to tailor the
process to the specific needs of the use cases. In the
following, we describe the steps of our process, focus-
ing on the update and upgrade loops, as highlighted in
Figure 1. For each step we describe relevant functions
to manage machine learning models.
4.1 Step 1: Plan Model
In this step, it is important to consider the specific requirements of the corresponding use case, its domain,
and the involved stakeholders. Making mistakes in
the model planning step can lead to costly re-planning
and to misleading results. Consequently, business re-
quirements need to be defined first, considering the
goals of the desired use cases. Second, stakeholder
workshops need to be organized, in which all involved
stakeholders can verify and evaluate these require-
ments and, if necessary, change them or come up
with new ones. Furthermore, it needs to be evaluated whether the desired model is feasible, i.e., whether it is economical, whether it achieves the defined goals, and whether the necessary data are available. The
planning step is essential for the (economical) success
of the model. If the desired model is not feasible, for
example, because the required data is not available or
the costs for realization are too high, it is discarded as
depicted in Figure 1 after Step 1. Consequently, the
process directly moves to the retirement of the model
(cf. Figure 1, step 6).
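The feasibility decision after Step 1 can be expressed as a simple gate. The three criteria below follow the discussion above, while the function and field names are illustrative, not part of the process model itself.

```python
from dataclasses import dataclass

@dataclass
class FeasibilityResult:
    """Outcome of the feasibility evaluation in the planning step."""
    data_available: bool   # are the necessary data available?
    economical: bool       # is the model economically viable?
    goals_reachable: bool  # can the model reach the defined goals?

def next_process_step(result: FeasibilityResult) -> str:
    """Continue to model building only if all criteria hold;
    otherwise move directly to model retirement (cf. Figure 1)."""
    if result.data_available and result.economical and result.goals_reachable:
        return "build_model"   # proceed to Step 2
    return "retire_model"      # proceed to Step 6
```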
Management Considerations. In this step, the
model needs to be semantically enriched with plan-
ning data that is further refined in the upcoming steps
of our process model. The planning data is mainly
related to business concerns and the expected usage
of the model in production. Planning data includes
information about, e.g., the corresponding use case,
the prediction that should be made, and the decision
based on the model. A method that can be used to col-
lect planning data via stakeholder workshops is the
machine learning canvas (Dorard, 2019). It helps to align the domain knowledge of business people, data scientists, and IT experts. However, the machine
learning canvas is rather static and provided as a docu-
ment template. It would be useful to store its contents
and link them to models in order to enable a semantic
search for models. For example, data scientists may want to search for models to get a rough idea of how to develop a new model for a similar use case.
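As a sketch of this idea, canvas contents could be stored as structured planning data linked to model identifiers, enabling a simple keyword search over them. The field names below are illustrative assumptions, not the actual fields of the machine learning canvas.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class PlanningData:
    """Planning metadata attached to a model (illustrative fields)."""
    use_case: str
    prediction: str
    decision: str
    tags: List[str] = field(default_factory=list)

class ModelRegistry:
    """Minimal in-memory registry linking models to their planning data."""

    def __init__(self) -> None:
        self._planning: Dict[str, PlanningData] = {}

    def register(self, model_id: str, planning: PlanningData) -> None:
        self._planning[model_id] = planning

    def search(self, keyword: str) -> List[str]:
        """Return ids of models whose planning data mentions the keyword."""
        kw = keyword.lower()
        return [
            model_id for model_id, p in self._planning.items()
            if kw in p.use_case.lower()
            or kw in p.prediction.lower()
            or kw in p.decision.lower()
            or any(kw in tag.lower() for tag in p.tags)
        ]
```

A data scientist could then, e.g., search the registry for "maintenance" to find earlier models planned for similar use cases and reuse their planning decisions as a starting point.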
ICEIS 2019 - 21st International Conference on Enterprise Information Systems