From Monolithic Systems to Microservices: A Decomposition
Framework based on Process Mining
Davide Taibi (https://orcid.org/0000-0002-3210-3990) and Kari Systä (https://orcid.org/0000-0001-7371-0773)
TASE - Tampere Software Engineering Research Group, Tampere University, Tampere, Finland
Keywords:
Microservices, Cloud-native, Microservice Slicing, Microservice Decomposition, Microservice Migration.
Abstract:
Decomposition is one of the most complex tasks during the migration from monolithic systems to microservices, and it is generally performed manually, based on the experience of the software architects. In this work, we propose a 6-step framework to reduce the subjectivity of the decomposition process. The framework provides software architects with a set of decomposition options, together with a set of measures to evaluate and compare their quality. The decomposition options are identified from the independent execution traces of the system, by applying a process-mining tool to the log traces collected at runtime. We validated the process in an industrial project by comparing the proposed decomposition options with the one proposed by the software architect, who manually analyzed the system. The application of our framework allowed the company to identify issues in their software that the architect had not spotted manually, and to discover more suitable decomposition options that the architect had not considered. The framework could also help other companies improve the quality of the decomposition of any monolithic system, identifying different decomposition strategies and reducing the subjectivity of the decomposition process. Moreover, researchers could extend our approach to further automate the decomposition support.
1 INTRODUCTION
Legacy and monolithic systems have become hard to maintain because of tight coupling between their internal components. Modifying a feature in one class often involves changes in several other classes, thereby increasing the needed development time and effort. The decomposition into small and independent modules is a strategy that companies may adopt to improve maintainability (Parnas, 1972) (Soldani et al., 2018). Often, at the same time, companies want to utilize the benefits of microservices, such as independent development, scaling, and deployment (Taibi et al., 2017d).
Microservices are relatively small and au-
tonomous services deployed independently, with a
single and clearly defined purpose (Fowler and Lewis,
2014). Their independent deployment brings several advantages: they can be developed in different programming languages, they can scale independently from other services, and they can be deployed on the hardware that best suits their needs.
Moreover, because of their size, they are easier to maintain and more fault-tolerant, since a failure of one service will not break the whole system, as could happen in a monolithic system. Since every microservice has its own context and code base, a microservice can change its entire internal logic while still behaving the same from the outside, reducing the need for interaction between teams (Taibi et al., 2017b) (Taibi et al., 2017c).
However, decomposing a monolithic system into
independent microservices is one of the most critical
and complex tasks (Taibi et al., 2017d)(Taibi et al.,
2017e) and several practitioners claim the need for a
tool to support them during the slicing phase in order
to identify different possible slicing solutions (Taibi
et al., 2017d) (Taibi et al., 2018). The decomposi-
tion is usually performed manually by software archi-
tects (Taibi et al., 2017d)(Soldani et al., 2018). Up to
now, the only help available to software architects is the static analysis of dependencies with tools such as Structure 101 (http://www.structure101.com), while the slicing of the system is commonly delegated to the experience of the
software architects themselves. Moreover, static dependency analysis tools are not able to capture the dynamic behavior of the system, and run-time dependencies such as frequent method calls could have an influence on both maintainability and performance. Thus, we decided to approach the slicing based on runtime behavior instead of only considering static dependencies.
In order to ease the identification of microser-
vices in monolithic applications, we adopted a data-
driven approach for identifying microservices candi-
dates based on process mining performed on log files
collected at runtime. Our approach combines process
mining techniques and dependency analysis to recom-
mend alternative slicing solutions. Our decomposi-
tion approach can be used by software architects to
support their decisions and to help them easily iden-
tify the different business processes in their applica-
tions and their dependencies, reducing the subjectivity and the risks related to the slicing process.
We validated this work with an industrial case
study performed in collaboration with an SME that
we supported in the migration phase, comparing the
decomposition solution proposed by the software ar-
chitect with the one obtained from the application of
our process-mining based approach.
The results show that process mining can be effec-
tively used to support the decomposition of microser-
vices and that it also supports the identification of ex-
isting architectural issues in monolithic systems. The
result can be used by companies to reduce the risk of
a wrong slicing solution, suggesting different slicing
options to the software architects and providing addi-
tional analysis of the software asset.
This paper is structured as follows. Section 2
presents the background on processes for migrating
and splitting monolithic systems into microservices.
Section 3 describes our proposed approach. Section
4 reports on the industrial case study. Section 5 dis-
cusses the results, while Section 6 draws conclusions.
2 BACKGROUND AND RELATED
WORK
Decomposing a system into independent subsystems
is a task that has been performed for years in soft-
ware engineering. Parnas (Parnas, 1972) proposed the
first approach for modularizing systems in 1972. Af-
ter Parnas’s proposal, several works proposed differ-
ent approaches (Lenarduzzi et al., 2017a). Recently,
the decomposition of systems took on another dimen-
sion thanks to cloud-native systems and especially
microservices. In microservices, every module is de-
veloped as an independent and self-contained service.
2.1 The Microservice Decomposition
Process
Taibi et al. (Taibi et al., 2017d) conducted a survey among 21 practitioners who had adopted microservices at least two years earlier, in order to collect their motivations for adopting microservices, as well as the evolution, benefits, and issues of the adoption. Based on the results, they proposed a migration process framework composed of three processes for migrating from a monolithic system to a microservices-based one: two processes for the redevelopment of the whole system from scratch, and one for creating new features with a microservice architecture on top of the existing system. The goal of the first two
processes is to support companies that need to migrate
an existing monolithic system to microservices by re-
implementing the system from scratch. The aim of the
third approach is to implement new features only as
microservices, to replace external services provided
by third parties, or to develop features that need im-
portant changes and therefore can be considered as
new features, thus gradually eliminating the existing
system. All three of the identified processes are based
on four common steps but differ in the details.
Analysis of the System Structure. All processes start by analyzing dependencies, mainly with the support of tools (Structure101, SchemaSpy (http://schemaspy.sourceforge.net/), or others).
Definition of the New System Architecture. Architectural guidelines or principles are defined, and a decomposition into small microservices is proposed. The decomposition is always done manually.
Prioritization of Feature/Service Development. In
this step, all three processes identify and prioritize
the next microservices to be implemented. Some
processes prioritize microservices based on cus-
tomer value; others according to components with
more bugs; and yet others prioritize the develop-
ment of new features as microservices, expecting
that, in the long run, the new ecosystem of mi-
croservices will gradually replace each feature of
the existing monolith.
Coding and Testing are then carried out like any
other software development project. Developers
adopt the testing strategy they prefer. However, in
some cases, testing of the different microservices
is performed by doing unit testing at the microser-
vices level and black-box testing at the integration
level.
In this work, we focus mainly on the first two
steps, supporting companies in the analysis of the sys-
tem structure and in the identification of decomposi-
tion alternatives. The architectural guidelines should
be defined by the company based on their internal
policies.
2.2 Proposed Approaches for
Identifying Microservices
Only a limited set of research works propose ap-
proaches aimed at supporting developers in decom-
posing their systems into an optimal set of microser-
vices.
Abbott and Fisher (Abbott and Fisher, 2015) proposed a decomposition approach based on the ”scalability cube”, which splits an application into smaller components to achieve higher scalability. Richardson (Richardson, 2017) also mentioned this approach in his four decomposition strategies:
”Decompose by business capability and define
services corresponding to business capabilities”;
”Decompose by domain-driven design sub-
domain”;
”Decompose by verb or use case and define services that are responsible for particular actions”;
”Decompose by nouns or resources by defining
a service that is responsible for all operations on
entities/resources of a given type”.
The first two strategies are mostly abstract pat-
terns of human decisions (Yourdon and Constantine,
1979) while the others are based on predefined crite-
ria. Kecskemeti et al. (Kecskemeti et al., 2016) pro-
posed a decomposition approach based on container
optimization. The goal is to increase the elasticity of
large-scale applications and the possibility to obtain
more flexible compositions with other services.
Arndt and Guercio suggest decomposing a monolithic system using a layered architecture style, with the outcome being highly cohesive and loosely coupled services, such as presentation and business services.
Another possibility is to start from a monolithic sys-
tem and progressively move towards a microservices-
based architecture (Zimmermann, 2017) or deliver-
ing separate microservices by splitting a development
team into smaller ones responsible for a limited group
of microservices.
Vresk et al. (Vresk and Cavrak, 2016) defined
an IoT concept and platform based on the orches-
tration of different IoT system components, such as
devices, data sources, data processors, and storage.
They recommend combining verb-based and noun-
based decomposition approaches. The proposed ap-
proach hides the complexity stemming from the vari-
ation of end-device properties thanks to the applica-
tion of a uniform approach for modeling both physi-
cal and logical IoT devices and services. Moreover, it
can foster interoperability and extensibility using di-
verse communication protocols into proxy microser-
vice components. Gysel et al. (Gysel et al., 2016)
proposed a clustering algorithm approach based on
16 coupling criteria derived from literature analysis
and industry experience. This approach is an exten-
sible tool framework for service decomposition as a
combination of a criteria-driven methods. It inte-
grates graph clustering algorithms and features pri-
ority scoring and nine types of analysis and design
specifications. Moreover, this approach introduces
the concept of coupling criteria cards using 16 dif-
ferent instances grouped into four categories: Cohe-
siveness, Compatibility, Constraints, and Communi-
cations. The approach was evaluated by integrating
two existing graph clustering algorithms, combining
actions research and case study investigations, and
load tests. The results showed potential benefits to
the practitioners, also confirmed by user feedback.
Chen et al. (Chen et al., 2017) proposed a data-driven microservices-oriented decomposition approach based on data flow diagrams from business logic. Their approach could deliver more rational, objective, and easy-to-understand results thanks to objective operations and data extracted from real-world business logic. Similarly, we adopt process mining to analyze the business processes of a monolithic system.
Alwis et al. (De Alwis et al., 2018) proposed a
heuristic to slice a monolithic system into microser-
vices based on object subtypes (i.e., the lowest gran-
ularity of software based on structural properties) and
functional splitting based on common execution frag-
ments across software (i.e., the lowest granularity of
software based on behavioral properties). This ap-
proach is the closest to our work. However, they analyzed the system by means of static analysis, without capturing its dynamic behavior, and they did not propose measures to evaluate the quality of the proposed slicing solution.
Taibi et al. proposed a set of patterns and anti-patterns that should be carefully considered during the microservice decomposition (Taibi and Lenarduzzi, 2018) (Taibi et al., 2019), recommending avoiding a set of harmful practices such as cyclic dependencies and hard-coded endpoints, and also considering critical anti-patterns and code smells (Taibi et al., 2017a) that the monolithic system may contain.
3 THE DECOMPOSITION
FRAMEWORK
Applications built from microservices should be as
decoupled and as cohesive as possible (Fowler and
Lewis, 2014). In the case of loosely coupled services,
changes to one service should not require changes to
other services. Therefore, the developers of microser-
vices can change and deploy their microservices inde-
pendently. As reported by Sam Newman (Newman,
2015), “a loosely coupled service knows as little as
it needs to about the services with which it collabo-
rates.” Therefore, developers should limit the number of different types of calls from one service to another.
Cohesion is the degree to which the elements of
a certain class belong together. It is a measure of
how strongly related each piece of functionality of
a software module is (Fenton and Bieman, 2014).
High cohesion makes the reasoning easy and limits
the dependencies (Kramer and Kaindl, 2004). Low
coupling is commonly correlated with high cohe-
sion (Kramer and Kaindl, 2004) (Jabangwe et al.,
2015). In microservices-based systems, high cohesion is achieved by grouping common business processes
together, so that, if developers need to change the be-
havior, they need to change only a single microser-
vice (Newman, 2015). Practitioners commonly ana-
lyze dependencies with tools such as Structure 101.
However, while dependency analysis tools can sup-
port the identification of static dependencies, they do
not enable the identification of the full execution path.
Our approach combines process mining techniques
and dependency analysis to recommend alternative
slicing solutions. In the next sub-sections, we report
the underlying assumptions of our approach and the
different steps that compose the decomposition pro-
cess.
3.1 Assumptions
The approach requires the availability of an extended version of a log trace collected at runtime. For each user operation performed from the user interface (e.g., clicking on a button) or from any other entry point of the system (e.g., APIs or the command line), all the activities must be traced in the log files, including information about each class and method traversed during the execution of the operation. The execution must be traced completely, from the entry point (a click on a submission form or the beginning of an operation) to the access to the database (if any) and to the results returned to the client. An example of the data reported in the log file is shown in Table 2.
In this step, we instrument the software to produce a log. The log trace must include events that indicate the entrance and exit of functions as well as database accesses. Each entry in the log also has a timestamp and a session ID, to distinguish between usages of the same class or method by different users. The log trace can be collected by instrumenting the source code with Aspect Oriented Programming, by adding the log trace into each method, with existing applications such as Elastic APM (https://www.elastic.co/solutions/apm) or similar, or by adopting an approach similar to the one applied in (Suonsyrjä, 2015). In case the data collection is not yet in place, we recommend using Elastic APM, since it allows the code to be instrumented with minor effort. For some languages (e.g., Java and Node.js), the instrumentation only requires adding one line of code to the application configuration, specifying the type of log trace required and the logging server URL.
3.2 The Decomposition Process
Once the log files are created, companies can start
the decomposition following our 6-step process (Fig-
ure 1).
Step 1: Execution Path Analysis
In the first step, we identify the most frequently used execution paths with a process-mining tool. In our case, DISCO (https://fluxicon.com/disco/) was used to graphically represent the business processes by mining the log files; the same result can be obtained with any other process-mining tool. The result is a graphical representation of the processes, reporting each class and database table used in the business processes, with a set of arrows connecting the classes based on the log traces. This first step produces a figure similar to the one presented in Figure 2, which allows one to understand:
Runtime execution paths of the system. Paths
never used, even if possible, are not represented
in the figure.
Dependencies between the classes of the system.
The arrows represent the dependencies between
methods and classes. External dependencies to li-
braries or web-services are also represented.
The frequency of usage of each path. Process-mining tools present the most used processes with thicker arrows.
Branches and Circular Dependencies. The graphical representation allows easy discovery of circular dependencies or branches (e.g., conditional statements that lead to different paths based on the input provided), in case they exist.

Figure 1: The Decomposition Process (Execution Path Analysis, Frequency Analysis, Circular Dependency Removal, Identification of Decomposition Options, Metric-based Ranking, Selection of the Decomposition Solution).

Table 1: Frequency analysis of each execution path.

Path                                                         Freq.
A.a(), A.b(), B.b(), C.c(), DB.query, Table A, Table B, ...  1000
A.b(), A.c(), B.a(), C.c(), DB.query, Table A, Table B, ...   150

Figure 2: Simplified Process Example.

The complete chain of arrows forms a candidate process. Figure 2 represents a simplified example of one business process, corresponding to the data reported in Table 2.
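Process-mining tools such as DISCO typically ingest event logs as CSV files with at least a case identifier, an activity name, and a timestamp. A minimal sketch of the conversion, assuming the comma-separated trace format sketched earlier and using the session ID as case ID (file names and column names are illustrative):

```python
import csv

# Convert the raw trace into an event log with case ID, activity, timestamps.
with open("trace.log") as src, open("event_log.csv", "w", newline="") as dst:
    writer = csv.writer(dst)
    writer.writerow(["case_id", "activity", "start", "end"])
    for line in src:
        start, end, session, module, method = line.strip().split(",")
        writer.writerow([session, "%s.%s()" % (module, method), start, end])
```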
Step 2: Frequency Analysis of the Execution Paths
The thickness of the arrows created by the DISCO tool indicates the frequency of the calls between classes. This makes it possible to clearly understand which execution paths are used most frequently and which classes are rarely or never used at runtime. The output of this step is a table reporting all the different execution paths together with the frequency of their usage (see Table 1).

Table 2: Example of Log Traces (timestamps are shortened for reasons of space).

Start  End    Sess.ID  Class        Method
00:00  00:36  S1       Form.jsp     btnClick()
01:00  01:39  S1       A.java       a()
01:40  01:45  S1       A.java       b()
01:45  01:55  S1       B.java       b()
01:56  02:05  S1       B.java       c()
02:05  02:13  S1       DB.java      query()
02:14  02:21  S1       DB           TABLE A
02:22  02:28  S1       DB           TABLE B
02:29  02:36  S1       B.java       c()
02:36  02:45  S1       B.java       b()
02:46  02:55  S1       A.java       b()
02:56  03:03  S1       A.java       c()
03:04  03:16  S1       Results.jsp  render()

Figure 3: Simplified Process Example.
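The same event log also yields the frequency table directly, without the graphical tool: grouping events per session into ordered paths and counting identical paths reproduces the content of Table 1. A sketch under the same assumed file format:

```python
import csv
from collections import Counter, defaultdict

# Group activities per session; the log is assumed chronologically ordered.
paths = defaultdict(list)
with open("event_log.csv") as f:
    for row in csv.DictReader(f):
        paths[row["case_id"]].append(row["activity"])

# Count how often each distinct execution path occurs (cf. Table 1).
frequency = Counter(tuple(path) for path in paths.values())
for path, freq in frequency.most_common():
    print(" -> ".join(path), freq)
```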
Step 3: Removal of Circular Dependencies
In this step, we first find circular dependencies by analyzing the execution paths reported in the table generated in the previous steps (e.g., Table 1). This can be done with a simple algorithm that discovers cycles in the execution paths. In the case of circular dependencies, software architects should discuss with the development team how to break these cycles. One example of the patterns that can be applied to break the cycles is Inversion of Control (Martin, 2003). However, every cyclic dependency could need a different breaking solution that must be analyzed carefully. The result is a refined version of the execution path table (see Table 1 as an example).
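The cycle check itself is a standard graph problem: the execution paths induce a directed call graph, and a depth-first search reports the edges that close a cycle. A sketch with illustrative class names:

```python
from collections import defaultdict

def find_cycles(paths):
    """Detect cyclic dependencies in the call graph induced by the paths.

    paths: iterable of execution paths, each an ordered list of class names.
    Returns the set of edges that close a cycle.
    """
    graph = defaultdict(set)
    for path in paths:
        for caller, callee in zip(path, path[1:]):
            if caller != callee:
                graph[caller].add(callee)

    back_edges, visited, stack = set(), set(), []

    def dfs(node):
        visited.add(node)
        stack.append(node)
        for nxt in graph[node]:
            if nxt in stack:                 # edge back into the current path
                back_edges.add((node, nxt))  # -> cyclic dependency
            elif nxt not in visited:
                dfs(nxt)
        stack.pop()

    for node in list(graph):
        if node not in visited:
            dfs(node)
    return back_edges

print(find_cycles([["A", "B", "C", "A"], ["A", "C"]]))  # {('C', 'A')}
```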
Step 4: Identification of Decomposition Options
Starting with the execution paths without cyclic dependencies obtained from Step 3, we identify different decomposition alternatives by visually inspecting the generated graphs. The candidate processes may have common sub-paths, i.e., the processes may merge or split. Thus, different decomposition solutions are possible. This process could also be automated by developing an algorithm that provides all the different decompositions based on the paths with fewer intersections. However, in this case, we rely on expert-based decomposition. As highlighted in Figure 3, the decomposition options need to deal with the duplication of some classes or methods. As an example, the execution traces reported in Figure 3 show that both the green and the orange execution traces use j(). Therefore, software architects could propose two decomposition alternatives. The first option includes the creation of three microservices, where class E.java is duplicated in microservices MS2 and MS3. The second option includes the creation of two microservices, merging MS2 and MS3. Both options have pros and cons, but the decision to merge two execution traces or to split them into different microservices must be discussed with the team. If the two microservice candidates have different purposes, it is reasonable to consider splitting them. If they do the same thing, it is better to merge them into one single microservice.
Step 5: Metric-based Ranking of the Decomposi-
tion Options
In this step, we identify three measures to help software architects assess the quality of the decomposition options identified in Step 4: coupling, number of classes per microservice, and number of classes that need to be duplicated.
Coupling
The decomposition into microservices should minimize coupling and maximize cohesion. Coupling and cohesion can be calculated with different approaches. While coupling can be obtained from our log traces, all the cohesion measures also require knowledge about the access to the local variables of each class, which makes it impossible to calculate them from the data reported in the log traces. However, coupling is commonly considered as inversely proportional to cohesion (Jabangwe et al., 2015); therefore, a system with low coupling will have a high likelihood of having high cohesion (Jabangwe et al., 2015). We define Coupling Between Microservices (CBM) by extending the well-known Coupling Between Objects (CBO) metric proposed by Chidamber and Kemerer (Chidamber and Kemerer, 1994). CBO represents the number of classes coupled with a given class (efferent and afferent couplings). This coupling can occur through method calls, field accesses, inheritance, arguments, return types, and exceptions.
We calculate the relative CBM for each microservice as follows:

$CBM_{MS_j} = \dfrac{\text{Number of external links}}{\text{Number of classes in the microservice}}$

where “Number of external links” represents the number of calls to external services made by the classes of microservice $MS_j$. An external service linked several times by different classes of the same microservice is counted only once. External services can be other microservices, external APIs, etc.
CBM is calculated for each microservice indepen-
dently and presented in a table for the next step.
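A sketch of how CBM can be computed from the mined call edges, under the stated rule that repeated links to the same external service count once; the class names and data structures are illustrative:

```python
def cbm(microservice_classes, call_edges):
    """CBM = distinct external call targets / number of classes.

    microservice_classes: set of class names in the candidate microservice.
    call_edges: iterable of (caller_class, callee) pairs mined from the logs.
    Callees outside the set count as external links, each counted once.
    """
    external = {callee for caller, callee in call_edges
                if caller in microservice_classes
                and callee not in microservice_classes}
    return len(external) / len(microservice_classes)

ms1 = {"InvoiceForm", "MainInvoice", "ValidateInvoice"}
edges = [("MainInvoice", "DBManager"), ("MainInvoice", "SDIWebservice"),
         ("ValidateInvoice", "MainInvoice")]
print(round(cbm(ms1, edges), 2))  # 0.67: two external links, three classes
```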
Number of Classes per Microservice
This measure helps to understand how big the identified microservices are and whether some microservices are too big compared to others. The number of classes should be minimized, since the smaller the number of classes, the more independent the development of the microservice can be. Considering the example reported in Figure 3, decomposition option 1 has seven classes, while option 2 has six classes.
Number of Classes that Need To Be Duplicated
In some cases, several classes will be in common between two execution traces. As an example, method j() of class E (Figure 3) is used by two execution traces. In the example depicted in Figure 3, decomposition option 1 has one class that needs to be duplicated, while option 2 requires no classes to be duplicated. This measure helps in reasoning about the different slicing options, considering not only the size of the microservices but also the number of duplications, which will then be reflected in the development of the microservices. Duplicated classes should be avoided, since duplication adds to the size of the system and its maintenance effort.
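Both size measures follow directly from the class sets of a decomposition option; a minimal sketch, where the class sets are illustrative and only the duplicated class E stands in for the example of Figure 3:

```python
from collections import Counter

def size_and_duplication(option):
    """option: list of class sets, one set per candidate microservice."""
    sizes = [len(ms) for ms in option]
    counts = Counter(cls for ms in option for cls in ms)
    duplicated = [cls for cls, n in counts.items() if n > 1]
    return sizes, duplicated

# Option 1 of Figure 3: class E is shared by MS2 and MS3 and must be duplicated.
option1 = [{"A", "B"}, {"C", "D", "E"}, {"F", "E"}]
print(size_and_duplication(option1))  # ([2, 3, 2], ['E'])
```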
Step 6: Selection of the Decomposition Solution
This is the final step where, based on the different
decomposition alternatives identified in Step 4, and
on the measures collected in Step 5, software archi-
tects can decide which solution to adopt by merg-
ing or splitting existing processes. Software archi-
tects could consider the recommendation provided by
our decomposition process and discuss with the team
which solution is most suitable for them, consider-
ing the organizational structure. Our process does not
recommend the best decomposition solution, but pro-
vides a reasoning framework on the possible decom-
position options.
4 VALIDATION: INDUSTRIAL
CASE STUDY
In this section, we validate the decomposition framework proposed in Section 3. With this study, we
aim to understand whether our approach can support
developers in easily identifying different slicing op-
tions. For this purpose, we performed an industrial
case study to compare the ad-hoc decomposition so-
lution proposed by the software architect with the so-
lutions proposed by our approach.
According to our expectations, we formulated the
goal of the case study as:
Analyze the proposed decomposition solutions
for the purpose of evaluating and comparing
with respect to the perceived slicing easiness and
usefulness
in the context of the migration of a monolithic system
to microservices
from the point of view of software architects.
We formulated the questions of the case study
as follows and further derived questions and metrics
from them:
Q1: Does adopting the proposed decomposition framework ease the identification of different microservices?
Q2: What do the developers, software architects, and project manager think about the applicability of this approach?
Q3: Are the developers willing to use the approach in the future?
We answered our questions by surveying the project manager and the software architect, who first manually applied the decomposition process as usual and then evaluated the decomposition options proposed by our framework. The measures identified for the questions were derived from the Technology Acceptance Model (Venkatesh, 2000). All questions were evaluated on a 5-point ordinal Likert scale with the following options: 1 = strongly disagree, 2 = disagree, 3 = neither agree nor disagree, 4 = agree, 5 = strongly agree.
Q1 - Perceived Ease of Use: Here we aim to
compare the perceived easiness of our approach
with that of the experience-based (ad-hoc) approach.
We adopted the Technology Acceptance Model
(Venkatesh, 2000) to collect measures about the ease
of use of our approach, identifying the following met-
rics:
The process-mining approach would be easy for
me to use during the decomposition phase.
It would be easy for me to become skillful at us-
ing the process-mining approach to decompose a
monolithic system.
Q2 - Applicability: What do the participants think about the applicability of our approach? To answer this question, we collected the time overhead needed to perform the process-mining approach.
Perceived usefulness: measures the degree to which the participants considered the approach useful for making project decisions. The evaluated criteria were:
I am sure that I was able to better decompose the
system with this approach.
I was able to find alternative decomposition strate-
gies.
I was able to better decompose the system, but the
time required with the new approach is too much
compared to its benefits.
I was able to better decompose the system, but the
effort needed to trace the information on the log
file is too much compared to the benefits of the
approach.
The approach helped me to understand existing architectural issues in the monolithic system.
Perceived understandability: measures the effort needed by the participants to understand the approach and its relationship with the system concepts:
It was easy for me to understand how the approach works.
Perceived easiness: measures the degree to which the subject believed that he or she was able to make project decisions more easily than without the approach:
It was easy for me to identify decomposition op-
tions with the support of this approach.
I was able to identify decomposition options with
less effort compared to the ad-hoc manual decom-
position.
I was able to identify decomposition options more
accurately.
Self-efficacy by Applying the Technique: The per-
ceived ability to decompose a monolithic system into
microservices by means of our proposed approach.
It was easy for me to keep an overview of the
project and of the different decomposition op-
tions.
The approach helped me to increase the quality of
the decompositions.
Q3 - Willingness to Use our Approach in the Fu-
ture: With this question, we aim to understand
whether the company would be willing to use our sys-
tem in the future. We collected this measure with the
following question:
I will adopt this approach in the future.
Table 3 reports the list of questions and the results of this study.
4.1 Study Context
The approach was applied in an SME in Milan (Italy).
The company develops a document management sys-
tem for bookkeeping, for Italian tax accountants. The
goal of the system is to make it possible to manage the
whole bookkeeping process, including management
of the digital invoices, sending the invoice to the Min-
istry of Economic Development, and fulfilling all the
legal requirements, which usually change every year.
The system is developed by two teams of four developers, plus two part-time developers following the moonlighting Scrum process (Taibi et al., 2013), a software architect, and a project manager. It is currently being used by more than 2000 tax accountants, who need to store more than 20M invoices per year. The system has been developed for more than 12 years and is now composed of more than 1000 Java classes.
Table 3: The Questionnaire adopted in this study - Results (PM = Project Manager, SA = Software Architect; scores on the 5-point Likert scale).

Q1 - Perceived ease of use
- The proposed approach would be easy for me to use in the decomposition phase. (PM: 4, SA: 5)
- It would be easy for me to become skillful at using the process-mining approach to decompose a monolithic system. (PM: 3, SA: 4)

Q2 - Applicability
Perceived usefulness
- I am sure that I was able to better decompose the system with this approach. (PM: 5, SA: 5)
- I was able to find alternative decomposition strategies. (PM: 5, SA: 5)
- I was able to better decompose the system but the time required with the new approach is too much compared to its benefits. (PM: 4, SA: 4)
- I was able to better decompose the system but the effort needed to trace the information on the log file is too much compared to the benefits of the approach. (PM: 2, SA: 1)
- The approach helped me to understand existing architectural issues in the monolithic system. (PM: 4, SA: 5)
Perceived understandability
- It was easy for me to understand how the approach works. (PM: 4, SA: 4)
Perceived easiness
- It was easy for me to identify decomposition options with the support of this approach. (PM: 3, SA: 4)
- I was able to identify decomposition options with less effort compared to the ad-hoc manual decomposition. (PM: 5, SA: 4)
- I was able to identify decomposition options more accurately. (PM: 5, SA: 5)
Self-efficacy
- It was easy for me to keep an overview of the project and of the different decomposition options. (PM: 4, SA: 4)
- The approach helped me to increase the quality of the decompositions. (PM: 5, SA: 4)

Q3 - Willingness to use our approach
- I will adopt this approach in the future. (PM: 3, SA: 3)
The Italian government usually updates the book-
keeping process between December and January of
every year, which involves not only changing the tax
rate but also modifying the process of storing the in-
voices. However, tax declarations can be made start-
ing in March/April of each year. Therefore, in the best
case, the company has between two to four months to
adapt their software to the new rules in order to enable
tax accountants to work with the updated regulations
from March/April.
Up to now, the company used to hire a consultancy company to help them during these three months of fast-paced work. However, since the system grows year after year, they decided to migrate to microservices to facilitate the maintenance of the system (Saarimäki et al., 2019) and to distribute the work to independent groups, reducing communication needs and supporting fast delivery (Taibi et al., 2017c)(Taibi et al., 2017b).
4.2 Study Execution
We performed this study in three steps:
Step 1. Q1: The software architect manually iden-
tified the different microservices and a de-
composition strategy.
Step 2. Q2: We applied our 6-step process to identify different decomposition solutions and then compared them to the solution proposed by the software architect.
Step 3. Q3: The software architect and the project
manager provided feedback on the useful-
ness of our approach.
4.3 Case Study Results
With the support of the development team, the soft-
ware architect manually identified a set of microser-
vices. He first drew the dependency graph with Struc-
ture 101.
Then we applied our approach to mine their log files. The company already logged all the operations of their systems with Log4J (https://logging.apache.org/log4j), tracing all the information reported in Table 2, together with other information such as the user involved in the process, the ID of the customer the tax accountant is currently working with, and other information related to the current invoices.
From this information, we identified 39 differ-
ent business processes with DISCO. For confidential-
ity reasons, we can only disclose a high-level and
anonymized portion of the system. Figure 4 depicts
an example of two business processes (save invoice
and view invoice). We then calculated the frequency
of each process. DISCO can automatically draw
thicker arcs between processes, thereby simplifying
the approach.
Of the 39 processes, three had been used only three times during one year of logging, 17 less than 1000 times, and 12 between 1000 and 10K times; five processes had been used more than 60B times. This distribution was initially expected for most of the processes. The most frequently used processes were due to the submission of the invoices from each tax accountant.

Figure 4: The Proposed Slicing Options (simplified example): the invoice-handling classes (InvoiceForm, MainInvoice, ValidateInvoice, DBManager, AmazonGlacier) and the SDI Webservice (Ministry of Finance Information System) split between a first and a second microservice.
Based on the previous step, the software architect, together with the project manager, identified three slicing options, taking care to avoid circular dependencies between services, especially in the case of the three classes that suffered from an existing circular dependency. We then calculated the cohesion and coupling of the different slicing options. The CBM was counted from the outgoing calls reported in the output of the DISCO tool.
The first solution was composed of 25 microser-
vices (19 classes duplicated) with an average CBM of
0.16; the second solution was composed of 27 mi-
croservices (21 classes duplicated) with an average
CBM of 0.16; while the third solution was composed
of 21 microservices (14 classes duplicated) with an
average CBM of 0.22. Table 4 shows the measures
collected for five microservices of each decomposi-
tion solution.
The first result is the list of classes, methods, and database tables used in the different processes, together with their frequency of usage. Moreover, the solution proposed by the software architect had higher coupling and was more complex than the ones identified with the process-mining approach. The analysis of the log files also showed that some processes were used much more than expected, while one process traversed an unexpected set of classes, performing an incorrect process. Moreover, the team also discovered three unexpected circular dependencies and the presence of two harmful code smells (Taibi et al., 2017a).
One of the most important decision drivers that led to the selection of one of the three identified solutions was the type of decomposition. One of the identified decomposition options proposed slicing the system in a way that required creating more shared libraries. Shared libraries commonly increase the complexity of the migration, and they also increase the maintenance complexity. The other reason was related to the developers' knowledge and the code ownership. The selected solution allowed the system to be split while reducing the need to re-organize the teams and re-allocate the responsibility for the code. The developers preferred to migrate the code they had written in the past, instead of migrating code written by other developers.
5 DISCUSSION
In this work we proposed a decomposition process to
slice monolithic systems into microservices based on
their runtime behavior.
The main benefit of analyzing runtime information is the availability of data on the usage of each component, together with the dynamic analysis of dependencies. We identified several dead methods and classes that were never used at runtime, and we also spotted some cyclic dependencies. A static analysis of dependencies would have spotted the circular dependencies, but not all the dead code. Moreover, thanks to the information on the frequency of usage of each method, we better understood which features are used most, and we could prioritize the development and the slicing of the monolithic system accordingly. Without the information on the frequency of usage of methods, we could have created a microservice that would have performed most of the computational tasks.
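The dead-code detection mentioned above reduces to a set difference between the classes known statically and those observed in the log traces; a sketch with hypothetical class names:

```python
# Classes present in the code base but never traversed at runtime are
# dead-code candidates; each should still be confirmed manually.
all_classes = {"MainInvoice", "ValidateInvoice", "LegacyExporter"}  # static analysis
used_classes = {"MainInvoice", "ValidateInvoice"}  # mined from the log traces
print(all_classes - used_classes)  # {'LegacyExporter'}
```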
We are aware of possible threats to validity. We tried to reduce them by applying a commonly used process-mining tool (DISCO), which has been developed for several years and has been adopted by several companies and universities. However, the tool could have identified some processes incorrectly. Moreover, we are aware of the complexity of the data collection, since to adopt our process, companies need to instrument their code to collect the log files at the method level. Regarding the generalizability of the results, the validation case study was based on an analysis of the processes of one company. The project manager and the software architect had limited experience in decomposing systems into microservices, but the authors of this paper have more than four years of experience in supporting companies in decomposing systems into microservices and closely followed them during the migration.

Table 4: Decomposition metrics for the decomposition options (CBM, number of external links, number of classes, number of duplicated classes).

         Solution 1                Solution 2                Solution 3
MS     CBM  Links  Cls  Dup     CBM  Links  Cls  Dup     CBM  Links  Cls  Dup
MS1    0.08   6     75   2      0.26   7     27   0      0.13   5     39   2
MS2    0.29   9     31   4      0.33  11     33   2      0.26   7     27   0
MS3    0.08   2     25   0      0.06   2     33   1      0.33  10     31   4
MS4    0.16   7     43   2      0.08   4     50   3      0.17   7     41   3
MS5    0.17   5     30   0      0.18  10     56   0      0.14   4     28   0
Other companies could benefit from our lessons learned by applying this process to decompose their monolithic systems, but also by monitoring the runtime behavior of existing microservices to continuously spot possible issues. However, even though this approach was very beneficial for the company studied, the results could have a different impact on other companies. Researchers can benefit from this approach and extend it further. New optimization metrics could be defined and, in theory, it would be possible to propose an automated decomposition approach that identifies the slices by maximizing the identified metrics. Genetic algorithms could be a possible solution for this idea.
6 CONCLUSION
The decomposition of monolithic systems into mi-
croservices is a very complex and error-prone task,
commonly performed manually by the software archi-
tect.
In our work, we demonstrated the usefulness of existing process-mining approaches for decomposing monolithic systems based on the business processes they identify. Our goal is not to create a tool that supports automated slicing, but to provide extra support to software architects, helping them identify different slicing options and reducing the subjectivity of the decision.
We first proposed a simple process-mining ap-
proach to identify business processes in an existing
monolithic solution based on three steps. In the first
step, a process-mining tool (DISCO or similar) is used
to identify the business processes. In the second step,
processes with common execution paths are clustered
and a set of microservices is proposed based on busi-
ness processes with similar behavior, paying attention
to not include circular dependencies. In the third step,
we propose a set of metrics to evaluate the decompo-
sition quality.
We validated our approach in an industrial case
study. The software architect of the SME together
with the project manager identified a decomposition
solution and asked our consultancy to assess it and to
identify other possible decomposition options. This
enabled us to compare our process-mining approach
with the decomposition solution they proposed.
As a result, we found that our process simplified the identification of alternative decomposition solutions and provided a set of measures for evaluating the quality of the decomposition. Moreover, our process-mining approach keeps track of the classes and methods traversed by each process, which not only helps to identify business processes but also makes it possible to discover issues in the processes, such as unexpected behavior or unexpected circular dependencies.
In case log data is available, or in case it is possible to collect it, we highly recommend that companies planning to migrate to microservices use this approach, considering the very low effort needed to identify alternative solutions (less than two working days in our case) and the possible benefits that can be achieved.
Future work includes the development of a tool to facilitate the identification of the processes, the automatic calculation of the metrics, and the identification of other measures for evaluating the quality of the decomposition. We are also planning to further empirically validate this approach with other companies and to include dynamic measures for evaluating the quality of the system at runtime (Lenarduzzi et al., 2017b) (Tosi et al., 2012). In the future, we are also planning to adopt mining software repositories techniques to identify the areas that changed simultaneously in the past, to help developers understand which pieces of code are connected to each other.
Another possible future work is to include the identification of partial migrations, i.e., the migration of a limited set of processes from a monolithic system. Finally, we are also considering extending this work by
nally, we are also considering to extend this work by
proposing not only different decomposition options
but also a set of patterns for connecting microser-
vices based on existing common microservices pat-
terns (Newman, 2015) (Taibi et al., 2018) and anti-
patterns (Taibi and Lenarduzzi, 2018)(Taibi et al.,
2019).
REFERENCES
Abbott, M. L. and Fisher, M. T. (2015). The Art of Scalabil-
ity: Scalable Web Architecture, Processes, and Orga-
nizations for the Modern Enterprise. Addison-Wesley
Professional, 2nd edition.
Chen, R., Li, S., and Li, Z. (2017). From monolith to mi-
croservices: A dataflow-driven approach. In 2017
24th Asia-Pacific Software Engineering Conference
(APSEC), pages 466–475.
Chidamber, S. R. and Kemerer, C. F. (1994). A metrics
suite for object oriented design. IEEE Trans. Softw.
Eng., 20(6):476–493.
De Alwis, A. A. C., Barros, A., Polyvyanyy, A., and Fidge,
C. (2018). Function-splitting heuristics for discovery
of microservices in enterprise systems. In Pahl, C.,
Vukovic, M., Yin, J., and Yu, Q., editors, Service-
Oriented Computing, pages 37–53, Cham. Springer
International Publishing.
Fenton, N. and Bieman, J. (2014). Software Metrics: A
Rigorous and Practical Approach, Third Edition. 3rd
edition.
Fowler, M. and Lewis, J. (2014). Microservices.
Gysel, M., Kölbener, L., Giersche, W., and Zimmermann, O. (2016). Service cutter: A systematic approach to service decomposition. In European Conference on Service-Oriented and Cloud Computing (ESOCC 2016), pages 185–200.
Jabangwe, R., Börstler, J., Smite, D., and Wohlin, C. (2015). Empirical evidence on the link between object-oriented measures and external quality attributes: A systematic literature review. Empirical Softw. Engg., 20(3):640–693.
Kecskemeti, G., Marosi, A. C., and Kertesz, A. (2016).
The entice approach to decompose monolithic ser-
vices into microservices. In 2016 International Con-
ference on High Performance Computing Simulation
(HPCS), pages 591–596.
Kramer, S. and Kaindl, H. (2004). Coupling and cohesion
metrics for knowledge-based systems using frames
and rules. ACM Transaction on Software Engineer-
ing Methodologies, 13(3):332–358.
Lenarduzzi, V., Sillitti, A., and Taibi, D. (2017a). Analyz-
ing forty years of software maintenance models. In
39th International Conference on Software Engineer-
ing Companion, ICSE-C ’17. IEEE Press.
Lenarduzzi, V., Stan, C., Taibi, D., and Venters, G. (2017b).
A dynamical quality model to continuously monitor
software maintenance. In 11th European Conference
on Information Systems Management (ECISM).
Martin, R. C. (2003). Agile Software Development: Prin-
ciples, Patterns, and Practices. Prentice Hall PTR,
Upper Saddle River, NJ, USA.
Newman, S. (2015). Building Microservices. O’Reilly Me-
dia, Inc., 1st edition.
Parnas, D. L. (1972). On the criteria to be used in de-
composing systems into modules. Commun. ACM,
15(12):1053–1058.
Richardson, C. (2017). Pattern: Microservice architecture.
Saarimäki, N., Lomio, F., Lenarduzzi, V., and Taibi, D. (2019). Does migrating a monolithic system to microservices decrease the technical debt? arXiv e-prints, page arXiv:1902.06282.
Soldani, J., Tamburri, D. A., and Heuvel, W.-J. V. D. (2018).
The pains and gains of microservices: A systematic
grey literature review. Journal of Systems and Soft-
ware, 146:215 – 232.
Suonsyrjä, S. (2015). Designing an unobtrusive analytics framework for monitoring java applications. In International Workshop on Software Measurement (IWSM), pages 160–175.
Taibi, D., Diebold, P., and Lampasona, C. (2013). Moon-
lighting scrum: An agile method for distributed
teams with part-time developers working during non-
overlapping hours. In ICSEA - International Confer-
ence on Software Engineering and Advances. IARIA
XPS Press.
Taibi, D., Janes, A., and Lenarduzzi, V. (2017a). How de-
velopers perceive smells in source code: A replicated
study. Information & Software Technology, 92:223–
235.
Taibi, D. and Lenarduzzi, V. (2018). On the definition of
microservice bad smells. IEEE Software, 35(3):56–
62.
Taibi, D., Lenarduzzi, V., Ahmad, M. O., and Liukkunen,
K. (2017b). Comparing communication effort within
the scrum, scrum with kanban, xp, and banana devel-
opment processes. In Proceedings of the 21st Inter-
national Conference on Evaluation and Assessment in
Software Engineering, EASE’17, pages 258–263.
Taibi, D., Lenarduzzi, V., Janes, A., Liukkunen, K., and Ah-
mad, M. O. (2017c). Comparing requirements decom-
position within the scrum, scrum with kanban, xp, and
banana development processes. In Agile Processes
in Software Engineering and Extreme Programming,
pages 68–83.
Taibi, D., Lenarduzzi, V., and Pahl, C. (2017d). Processes,
motivations, and issues for migrating to microservices
architectures: An empirical investigation. IEEE Cloud
Computing, 4(5):22–32.
Taibi, D., Lenarduzzi, V., and Pahl, C. (2018). Architec-
tural patterns for microservices: a systematic mapping
study. 8th International Conference on Cloud Com-
puting and Services Science (CLOSER2018).
Taibi, D., Lenarduzzi, V., and Pahl, C. (2019). Microser-
vices architectural, code and organizational anti-
patterns. Springer (in press).
Taibi, D., Lenarduzzi, V., Pahl, C., and Janes, A. (2017e).
Microservices in agile software development: a
workshop-based study into issues, advantages, and
disadvantages. In XP Workshops, pages 23:1–23:5.
ACM.
Tosi, D., Lavazza, L., Morasca, S., and Taibi, D. (2012).
On the definition of dynamic software measures. In
ESEM, pages 39–48. ACM.
Venkatesh, V. (2000). Determinants of perceived ease
of use: Integrating control, intrinsic motivation, and
emotion into the technology acceptance model. Infor-
mation Systems Research, 11(4):342–365.
Vresk, T. and Cavrak, I. (2016). Architecture of an in-
teroperable iot platform based on microservices. In
MIPRO, pages 1196–1201. IEEE.
Yourdon, E. and Constantine, L. L. (1979). Structured De-
sign: Fundamentals of a Discipline of Computer Pro-
gram and Systems Design. Prentice-Hall, Inc., Upper
Saddle River, NJ, USA, 1st edition.
Zimmermann, O. (2017). Microservices tenets. Computer
Science - Research and Development, 32(3):301–310.