Challenges of Business Process Model Improvement after Reverse Engineering

María Fernández-Ropero, Ricardo Pérez-Castillo and Mario Piattini

Instituto de Tecnologías y Sistemas de la Información, University of Castilla-La Mancha

Paseo de la Universidad 4, 13071, Ciudad Real, Spain

Abstract. Business process models have become one of the most important assets for companies, since appropriate business process management helps companies adapt their processes quickly to change while maintaining or even improving their competitiveness. As a consequence, companies now demand mechanisms that ensure business processes of appropriate quality. Business process models can be obtained from existing information systems through reverse engineering. Unfortunately, reversed models usually have lower quality and may not reflect the actual business processes exactly. This paper describes the challenges detected that should be addressed to improve the quality of business process models, especially those retrieved by reverse engineering (e.g., missing or non-relevant elements, fine-grained elements, etc.). It also suggests an approach that improves business process models in three phases: repairing, refactoring and semantic improvement. In addition, some preliminary results for the refactoring stage are provided using real-life retrieved business process models.

1 Introduction

Business process management allows organizations to be more efficient, more effective and more readily adaptable to change than traditional management approaches. Business processes depict sequences of coordinated business activities, as well as the roles and resources involved, that organizations carry out to achieve a common business goal [1]. They are recognized as one of the most important assets of an organization because of the competitive advantages they provide [2]. To support their management, business processes can be represented by models that follow standard notations such as BPMN (Business Process Model and Notation) [3].

However, organizations may not have explicit business process models, or their models may not be aligned with current behavior. In these cases, reverse engineering can be used to mine business process models from existing information systems [4]. Nevertheless, business process models retrieved by reverse engineering entail some problems that can affect their quality, since every reverse engineering technique implies a semantic loss [5]. Although much academic literature is devoted to identifying the challenges present in business process models discovered by process mining (e.g., using event logs [6]) or modeled by hand [7], no challenges have been identified for business process models retrieved from existing information systems, for example, from source code. Such models can be incomplete, can contain non-relevant information, or may even contain ambiguities or uncertainties that decrease their understandability and, therefore, their quality. In these cases, it is necessary to improve the business process model in order to address these quality challenges while keeping it as close as possible to the reality that it represents [8].

Fernández-Ropero M., Pérez-Castillo R. and Piattini M.
Challenges of Business Process Model Improvement after Reverse Engineering.
DOI: 10.5220/0004602400670074
In Proceedings of the 1st International Workshop in Software Evolution and Modernization (SEM-2013), pages 67-74
ISBN: 978-989-8565-66-2
© 2013 SCITEPRESS (Science and Technology Publications, Lda.)

For this reason, this paper presents a set of challenges detected in business process models obtained by reverse engineering. These challenges have been collected from a literature review and from practical experience with business process models mined from several real-life information systems. In order to have several retrieved business process models to analyze, MARBLE [4], a reverse engineering approach and tool, was selected to mine them. Moreover, the paper introduces an approach to address these challenges. The proposal combines reverse engineering with other analysis approaches in order to mitigate the semantic loss that reverse engineering techniques entail. This loss arises because some reverse engineering techniques focus only on source code, although there are other knowledge sources from which knowledge can be extracted. Hence, the approach is divided into three stages: repairing, refactoring and expert-based improvement. Each stage uses additional knowledge (such as recorded event logs, guidelines, heuristics and expert decisions, among others) to improve the business process model. Although refactoring techniques are the most widely used solution for improving the quality of business process models [9, 10], this work proposes two additional stages in order to assist refactoring and enhance business process models. The work also presents some preliminary results achieved by using the proposed approach.

The remainder of the paper is organized as follows. Section 2 summarizes the challenges that retrieved business process models involve. Section 3 introduces the proposed approach, which attempts to address these challenges in three stages. Section 4 then shows some results obtained by using the proposed approach, in particular those obtained after the refactoring stage. Finally, conclusions and future work are discussed in Section 5.

2 Challenges in Retrieved Business Process Models

This section presents the challenges posed by the most common problems identified in business process models obtained through reverse engineering. These challenges have been collected from a literature review and from practical experience with business process models mined from several real-life information systems. The tool selected to mine business process models was MARBLE, an adaptive framework for recovering the underlying business process models from legacy information systems using source code [4]. MARBLE has been applied in several industrial case studies to recover business processes from a wide variety of legacy information systems. Conducting these case studies has allowed the tool to be improved and the MARBLE technique to be refined. So far, MARBLE has been used with six legacy systems in all: (i) a system managing a Spanish author organization; (ii) an open source CRM (Customer Relationship Management) system; (iii) an enterprise information system from the water and waste industry; (iv) an e-government system used in a Spanish local e-administration; (v) a high school LMS (Learning Management System); and finally (vi) an oncological evaluation system used in Austrian hospitals [11]. All business process models obtained in each case study (from each of the six systems) were analyzed by experts in order to identify errors that occur frequently. The challenges that retrieved business process models entail are collected in the following paragraphs.

• Completeness: Business process models mined by reverse engineering may not be fully complete because the data can be distributed across several sources, not just the source code itself, and therefore cannot be obtained solely through static analysis. Business process models may have missing nodes such as business tasks, gateways, events and data objects, as well as missing connections such as sequence flows (between tasks) and association flows (between tasks and data objects). This loss affects the semantic completeness of the model [12]. These missing elements may not have been instantiated at design time and, for that reason, may not appear in the business process model. As a consequence, one of the biggest challenges is to rediscover elements that were not recovered in the reverse engineering phase, as well as the order among different business activities. The order between activities is an important issue, since complete sequence flows between activities may not be obtained through reverse engineering because not all information can be derived automatically from source code. Start and end points may not have been defined in the model because there is not enough information to determine which activities begin or end a model, or which task is executed before another [13].

• Granularity: According to the approach proposed by Zou et al. [14], each callable unit in an information system is considered a candidate business task to be discovered by reverse engineering. However, existing information systems typically contain thousands of callable units of different sizes, so the resulting business tasks can have different levels of granularity [6, 13]: (1) large callable units that support the main business functionalities of the system (e.g., methods and functions of the domain and controller layer); (2) small callable units, such as getter and setter methods in object-oriented programming, that only read and write program variables but perform no real business task; (3) a set of small callable units that have similar behavior and jointly perform a business task; or (4) a set of small callable units that together support another, larger unit. In the last case, the main task may be considered the parent task and the small tasks its child tasks. One way to address this challenge is to take only coarse-grained callable units as candidate business tasks and to discard fine-grained ones, since fine granularity brings models closer to the source code perspective. Nevertheless, the dividing line between coarse- and fine-grained callable units is unknown. Authors like Polyvyanyy et al. [15] propose abstracting business process models to reduce unwanted details and to represent only the relevant information. These authors address the issue of different levels of granularity by proposing two techniques: (1) eliminating small tasks that are considered irrelevant and (2) grouping certain tasks into one while preserving the information.

• Relevance: In contrast with completeness, which is related to missing elements, relevance is related to business elements that have been retrieved erroneously. Information is considered non-relevant when it can be removed without losing information and while preserving the behavior. Such information (activities, events, etc.) may have been created at compile time but is not used at execution time, i.e., these elements do not carry out any business logic in the organization. The relevance of a business process model is an important aspect since it ensures that the model contains enough elements to convey its information [12]. The challenge must be addressed by identifying and removing all non-relevant elements in the business process model while preserving the semantics of the relevant parts.

• Uncertainty: Enhancing the understandability of a business process model is a challenge, given that poor understandability can lead to wrong conclusions. Understandability is usually worse in models obtained by reverse engineering from existing information systems, since the identifiers and names of elements may not be descriptive enough. This is because many identifiers are inherited from elements in the source code. For example, task labels usually consist of a concatenation of several capitalized words, following the naming conventions present in most programming approaches [16]. Names of this kind are uncertain but can provide a clue for finding more representative task names. This issue concerns interpretability from a language-usage perspective, i.e., how intuitive the language used to define the elements of the model is. For this reason, the labeling of elements can negatively affect the interpretation of the model when it does not follow an appropriate convention [12]. This challenge should be addressed by renaming the elements of business process models so that they faithfully represent the semantics actually performed.

• Ambiguity: Another challenge to be taken into account is the ambiguity that may be present in some business process elements. For example, redundancy faults sometimes occur during reverse engineering because different pieces of source code (e.g., two callable units) lead to two redundant business tasks that are actually part of a more complex task (e.g., a business task supported by both callable units). Ambiguity is important in determining the quality of business process models since it affects the understandability and modifiability of the model, i.e., how intuitively the elements in the model are formulated [12]. Ambiguity therefore negatively affects the ability to communicate the behavior of the business process efficiently. A model is considered unambiguous when it is free of redundancies and contains no elements that contradict the logic of other elements. Ambiguity must be addressed by detecting and removing redundancies and inconsistencies in a business process model.
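
The renaming idea described under the uncertainty challenge can be illustrated with a minimal sketch. It assumes that task labels are identifiers inherited from source code and written in camelCase or under_score style; the function name `suggest_label` is hypothetical and is not part of MARBLE or of any approach cited here. The result is only a candidate label, which would still need domain knowledge or expert review.

```python
import re


def suggest_label(identifier: str) -> str:
    """Derive a candidate task label from a source-code identifier (illustrative only)."""
    # Treat underscores as word separators.
    text = identifier.replace("_", " ")
    # Insert a space at lower/digit -> upper boundaries ("getData" -> "get Data").
    text = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    # Insert a space between an acronym and a following word ("XMLFile" -> "XML File").
    text = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", " ", text)
    # Keep acronyms as they are; lowercase every other word.
    words = [w if (w.isupper() and len(w) > 1) else w.lower() for w in text.split()]
    if not words:
        return ""
    label = " ".join(words)
    # Capitalize only the first character of the label.
    return label[0].upper() + label[1:]
```

For instance, an inherited identifier such as `calculateInvoiceTotal` would yield the candidate label "Calculate invoice total".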

3 Business Process Model Improvement Approach

In order to address the challenges outlined above, this paper presents an approach for improving business process models obtained from information systems, so that they reflect the business reality as faithfully as possible and with optimal levels of quality. The approach proposes three stages: repairing, refactoring and expert-based improvement. Each stage addresses some of the challenges mentioned above and uses certain knowledge sources to carry out its purpose. Fig. 1 depicts the horseshoe model that characterizes reengineering, where the upper stages (refactoring and expert-based improvement) represent a higher abstraction level than the bottom stage (repairing).

The repairing stage is placed at the reverse engineering level, since it uses knowledge sources such as recorded event logs to address the completeness challenge. The aim of this stage is to ensure that business process models reflect the real execution of the information system. Preliminary results concerning this stage are given in [17]. That work presents a set of steps that take a business process model and event logs as input and return as output an enhanced business process model with additional sequence flows retrieved from the event logs. The technique detects the sequence flows that appear in the event log but were not recovered in the model, and adds them to the target business process model in an orderly way. A case study conducted to demonstrate the feasibility of the technique shows that the fitness of the process model increases, i.e., repairing the business process model leads to a more faithful representation of the observed behavior.
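
The following sketch illustrates the general idea of adding unrecovered sequence flows from an event log. It is not the algorithm of [17]; it rests on simplifying assumptions: the event log is a list of traces, each trace is an ordered list of task names, the model is a set of tasks plus a set of (source, target) sequence flows, and a flow is added whenever two known tasks directly follow each other in some trace. The names `observed_flows` and `repair_model` are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Flow = Tuple[str, str]


def observed_flows(event_log: Iterable[List[str]]) -> Set[Flow]:
    """Collect every pair of tasks that directly follow each other in a trace."""
    flows: Set[Flow] = set()
    for trace in event_log:
        for source, target in zip(trace, trace[1:]):
            flows.add((source, target))
    return flows


def repair_model(tasks: Set[str], flows: Set[Flow],
                 event_log: Iterable[List[str]]) -> Set[Flow]:
    """Return the model flows plus the observed flows that were not recovered.

    Only flows between tasks already present in the model are added;
    events that refer to unknown tasks are ignored in this sketch.
    """
    missing = {
        (source, target)
        for (source, target) in observed_flows(event_log)
        if source in tasks and target in tasks and (source, target) not in flows
    }
    return flows | missing
```

A fuller treatment would also have to decide how added flows interact with gateways and how noisy or infrequent behavior in the log is filtered, which this sketch leaves out.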

The refactoring stage is concerned with modifying the internal structure of business process models without changing or altering their external behavior. This stage preserves both the abstraction level and the semantics of the model. Refactoring techniques therefore improve the quality of business processes, so that they become more understandable, maintainable and reusable [18]. This stage addresses challenges such as relevance, granularity, uncertainty and completeness. Guidelines, literature, heuristics and experience are additional resources used in this stage. Some refactoring operators, designed especially for use with reversed business process models, are introduced in [19]. For example, some refactoring operators address relevance by eliminating isolated nodes and unnecessary nesting, among others. Other refactoring operators address granularity by grouping elements, and others address completeness by following good practices in business process modeling. Some results obtained after applying these refactoring operators are shown in the next section. In this work, each refactoring operator is applied in isolation in order to show the change that each operator makes to the business process model.

Finally, the expert-based improvement stage addresses ambiguity and relevance by means of expert decisions. This stage is needed because not all challenges can be addressed automatically by the previous two stages; in certain situations, the opinion and feedback of an expert are also necessary to improve the business process model.

Fig. 1. Proposed improvement approach by means of three stages: repairing, refactoring and

expert-based improvement.

4 Refactoring Results

This section shows some results obtained in the second stage considered in the

approach. In order to illustrate the effect of refactoring operators on business process

models some aspects are defined:

The business process models taken as independent variables were mined from source code using MARBLE, the business process archeology tool used to identify the above challenges (cf. Section 2). The selected information system was Tabula, a web application of 33.3 thousand lines of code devoted to creating, managing and simulating decision tables that associate conditions with domain-specific actions. Fifteen business process models were retrieved from this information system.

The measures used to assess the understandability and modifiability of a business process model [20] are considered as dependent variables: the size (the number of elements such as tasks, events, gateways and data objects), the density (the ratio between the total number of flows in a business process model and the theoretical maximum number of possible flows given the number of elements), and the separability (the ratio between the number of nodes that serve as bridges between otherwise strongly-connected components and the total number of nodes) of the model.
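
The three measures can be sketched as follows. This is an illustration under stated assumptions rather than the measurement procedure of [20]: the model is reduced to a set of node names and a set of directed (source, target) flows, the theoretical maximum number of flows is taken to be n(n - 1) for n nodes (a directed graph without self-loops), and a "bridge" node is approximated as a cut vertex of the undirected view of the model, i.e., a node whose removal splits the model into more connected parts. All function names are hypothetical.

```python
from typing import Set, Tuple

Flow = Tuple[str, str]


def size(nodes: Set[str]) -> int:
    """Number of elements in the model."""
    return len(nodes)


def density(nodes: Set[str], flows: Set[Flow]) -> float:
    """Flows divided by the assumed maximum n * (n - 1) of directed flows."""
    n = len(nodes)
    if n < 2:
        return 0.0
    return len(flows) / (n * (n - 1))


def _components(nodes: Set[str], flows: Set[Flow]) -> int:
    """Count connected components, ignoring flow direction."""
    neighbours = {node: set() for node in nodes}
    for source, target in flows:
        # Flows touching a node outside `nodes` are skipped.
        if source in neighbours and target in neighbours:
            neighbours[source].add(target)
            neighbours[target].add(source)
    seen: Set[str] = set()
    count = 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        stack = [start]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(neighbours[current] - seen)
    return count


def separability(nodes: Set[str], flows: Set[Flow]) -> float:
    """Share of nodes whose removal increases the number of components."""
    if not nodes:
        return 0.0
    base = _components(nodes, flows)
    cut_vertices = sum(
        1 for node in nodes if _components(nodes - {node}, flows) > base
    )
    return cut_vertices / len(nodes)
```

For a simple chain of four tasks A, B, C and D connected by three flows, this sketch gives a size of 4, a density of 3/12 and a separability of 2/4, because only B and C disconnect the chain when removed.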

To illustrate briefly the results obtained by the refactoring stage, the following refactoring operators from [19] are used:
• R1 removes nodes (i.e., tasks, gateways or events) that are not connected to any other node in the business process model, in order to help remove non-relevant elements.
• R2 similarly removes elements of the business process model that are considered leaf nodes.
• R6 creates compound tasks by grouping several small tasks that support another main task, with the goal of reducing fine granularity.
• R7 combines data objects that are used by the same task, also in order to reduce fine granularity.
• R8 joins the start and end events to the starting and ending tasks, respectively, to complete the model.
• R9 adds join and split gateways that are missing in branches, in an effort to complete the model.
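
As an illustration of the simplest of these operators, the sketch below shows what R1 could look like on a simplified model. It is not the implementation of [19]; it assumes the model is given as a set of node names and a set of (source, target) flows, and the function name `remove_isolated_nodes` is hypothetical.

```python
from typing import Set, Tuple

Flow = Tuple[str, str]


def remove_isolated_nodes(nodes: Set[str], flows: Set[Flow]) -> Set[str]:
    """Keep only the nodes that take part in at least one flow (R1-like)."""
    # Every node that appears as the source or target of some flow.
    connected = {node for flow in flows for node in flow}
    # Nodes absent from all flows are considered isolated and dropped.
    return nodes & connected
```

Because isolated nodes take part in no flow, removing them leaves the set of flows unchanged while reducing the number of nodes, which is in line with the lower size and higher density reported for R1.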

After the application of each refactoring operator on each business process model

in isolation, values for each dependent variable are collected in Table 1, as well as the

gain obtained with respect to the original value. The gain is defined as the ratio

between the difference of measure values and the original measure value. Hence, a

positive gain means that the refactoring affects the measure positively while a

negative gain means that the refactoring affects the measure negatively. A zero gain

means that the value for a certain measure did not change after refactoring.
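
The gain can be written down as a short sketch. Two points are assumptions rather than statements from the text: the sign convention (original minus refactored, so that a decrease in a measure gives a positive gain, which matches the signs in Table 1) and the idea that the reported gain is the mean of per-model gains. The function names and the sample values in the usage note are made up for illustration.

```python
from typing import Sequence


def gain(original: float, refactored: float) -> float:
    """Relative change of a measure, positive when the value decreases (assumed sign)."""
    if original == 0:
        raise ValueError("gain is undefined when the original value is zero")
    return (original - refactored) / original


def mean_gain(originals: Sequence[float], refactoreds: Sequence[float]) -> float:
    """Average of the per-model gains over a collection of models."""
    if len(originals) != len(refactoreds) or not originals:
        raise ValueError("expected two non-empty sequences of equal length")
    gains = [gain(o, r) for o, r in zip(originals, refactoreds)]
    return sum(gains) / len(gains)
```

With made-up sizes of 40 and 10 before refactoring and 30 and 12 afterwards, the per-model gains are 0.25 and -0.2 and their mean is 0.025. If gains are indeed averaged per model, the mean gain need not equal the gain computed from the mean values, which may be why the gains in Table 1 cannot be derived directly from its mean columns.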

Table 1. Effect of each refactoring operator on the size, density and separability.

            Size               Density            Separability
            Mean     Gain      Mean     Gain      Mean     Gain
Original    35.200    0.000    0.110     0.000    15.533    0.000
R1          30.667    0.395    0.196    -0.607    11.000    0.460
R2          34.400    0.011    0.113    -0.023    14.733    0.019
R6          33.267    0.059    0.106     0.031    15.667   -0.006
R7          33.667    0.026    0.106     0.009    14.600   -0.003
R8          37.600   -0.410    0.207    -0.338    17.933   -0.447
R9          59.400   -0.142    0.105     0.076    15.600   -0.002

Table 1 reveals that removing isolated nodes (R1) decreases the size and separability while the density increases. Although the density is higher after R1, the relevance of the model has increased, since non-relevant elements have been removed. Similarly, R2 increases the density as the size decreases, and separability decreases slightly. R6 creates compound tasks in several business process models, which entails a decrease in size and density while separability increases slightly. The same happens with R7: the number of nodes and the density are lower but separability is higher. In contrast, all measures are higher after R8 because the business process models were incomplete. R9, in turn, causes a significant increase in size because there were several incoming and outgoing branches without gateways in the original business process models. The same occurs with the separability after R9, while density decreases slightly.

5 Conclusions

Reverse engineering has become a suitable solution for mining business process models from existing information systems. Unfortunately, the retrieved business process models entail some challenges that must be addressed in order to increase their quality. Completeness is an important challenge in retrieved business process models, since data are distributed across several sources. Different levels of granularity are also a challenge, because fine granularity lowers the quality of the model. Moreover, non-relevant information lowers quality, since the model should not contain additional elements that do not carry out any business logic in the organization. The uncertain labeling of elements may negatively affect understandability, and therefore an appropriate convention should be followed. In addition, ambiguity is another challenge, because a model should be free of redundancies and inconsistencies.

With all the above in mind, this paper presents an approach for improving business process models obtained from information systems in an effort to deal with the above challenges. The approach defines three stages: repairing, refactoring and expert-based improvement. These stages address the challenges mentioned above by using additional knowledge sources to achieve their goals. Moreover, in order to illustrate one of the stages, this work presents some results of the refactoring stage. The results show that the measures selected for assessing the quality of business process models, in terms of their understandability and modifiability, improve in most cases when non-relevant and fine-grained elements are removed and when models are completed. Although this work applies refactoring operators in isolation, studies reveal that refactoring operators are not commutative, which makes it necessary to determine the best execution order [19].

After the completion of this work, several lines of future work have been identified: (1) refining the repairing stage in order to obtain more valuable information from event logs for repairing retrieved business process models; (2) refining the refactoring stage by defining new refactoring operators that address more challenges, and using more measures to assess understandability and modifiability; and (3) defining the expert-based improvement stage, which uses expert decisions to remove ambiguities in the business process model.

Acknowledgements

This work was supported by the FPU Spanish Program and the R&D projects MAGO/PEGASO (Ministerio de Ciencia e Innovación [TIN2009-13718-C02-01]) and GEODAS-BC (Ministerio de Economía y Competitividad & Fondos FEDER [TIN2012-37493-C03-01]).

References

1. Weske, M., Business Process Management: Concepts, Languages, Architectures. 2007, Leipzig, Germany: Springer-Verlag Berlin Heidelberg. 368.

2. Jeston, J., J. Nelis, and T. Davenport, Business Process Management: Practical Guidelines to Successful Implementations. 2nd ed. 2008, NV, USA: (Elsevier Ltd.). 469.

3. OMG. Business Process Modeling Notation Specification 2.0. 2011; Available from: http://www.omg.org/spec/BPMN/2.0/PDF/.

4. Pérez-Castillo, R., et al., MARBLE. A Business Process Archeology Tool, in 27th IEEE International Conference on Software Maintenance. 2011: Williamsburg, VA. p. 578-581.

5. Fernández-Ropero, M., R. Pérez-Castillo, and M. Piattini, Refactoring Business Process Models: A Systematic Review, in ENASE 2012. Wrocław, Poland. p. 140-145.

6. van der Aalst, W., Process Mining: Overview and Opportunities. ACM Transactions on Management Information Systems (TMIS), 2012. 3(2): p. 7.

7. Indulska, M., et al., Business process modeling: Current issues and future challenges, in Advanced Information Systems Engineering. 2009. Springer.

8. Fahland, D. and W.M.P. van der Aalst, Repairing Process Models to Reflect Reality. 2012.

9. Weber, B. and M. Reichert, Refactoring Process Models in Large Process Repositories, in Proceedings of the 20th International Conference on Advanced Information Systems Engineering. 2008, Springer-Verlag. p. 124-139.

10. Dijkman, R., M. La Rosa, and H.A. Reijers, Managing large collections of business process models: Current techniques and challenges. Computers in Industry, 2012. 63(2): p. 91.

11. Pérez-Castillo, R., et al., A family of case studies on business process mining using MARBLE. Journal of Systems and Software, 2012. 85(6): p. 1370-1385.

12. Overhage, S., D.Q. Birkmeier, and S. Schlauderer, Quality Marks, Metrics, and Measurement Procedures for Business Process Models. Business & Information Systems Engineering, 2012: p. 1-18.

13. Pérez-Castillo, R., et al., Generating Event Logs from Non-Process-Aware Systems Enabling Business Process Mining. Enterprise Information System Journal, 2011. 5(3): p. 301-335.

14. Zou, Y. and M. Hung, An Approach for Extracting Workflows from E-Commerce Applications, in Proceedings of the Fourteenth International Conference on Program Comprehension. 2006, IEEE Computer Society. p. 127-136.

15. Polyvyanyy, A., S. Smirnov, and M. Weske, Business process model abstraction. Handbook on Business Process Management 1, 2010: p. 149-166.

16. Binkley, D., et al., To camelcase or under_score. 2009. IEEE.

17. Fernández-Ropero, M., et al., Repairing Business Process Models as Retrieved from Source Code, in BPMDS series, in conjunction with CAiSE'13. 2013: Valencia, Spain. In press.

18. Dijkman, R., et al., Identifying refactoring opportunities in process model repositories. Information and Software Technology, 2011.

19. Fernández-Ropero, M., et al., Assessing the Best-Order for Business Process Model Refactoring, in SAC 2013: Coimbra, Portugal. p. 1400-1406.

20. Fernández-Ropero, M., et al., Quality-Driven Business Process Refactoring, in International Conference on Business Information Systems 2012: Paris, France. p. 960-966.