Table 8: Performance analysis for time remaining prediction.

Method              MSE (hours^2)   MAE (hours)   Execution time (seconds)
Average             98.498,91       171,34        7
K-NN                96.481,01       166,89        79
Gradient Boosting   85.460,92       159,48        381
last, on average, 184,8 hours. The median duration of a case is
70,2 hours. Measured against the median and average durations,
these errors are large, but it is important to take into
consideration how much the duration of a case can vary and how
many activities a case contains. Usually, a short case is composed
of one activity, while longer cases can be composed of 3 or more
activities. The short cases, which are easy to predict, go through
just one state, while the others go through more states and are
harder to predict. As a result, the long cases are predicted more
times, once for each state they go through, and they are more
likely to be mispredicted, since their durations are so far from
typical values. This leads to the "high" errors shown above. There
are over 10.000 cases that last 30 days or more, and they appear,
on average, in 3 states. These cases therefore carry more weight
than short cases in the MSE and MAE shown above.
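The weighting effect described above can be illustrated with a small sketch. The case identifiers, state counts and error values below are invented for illustration only, and the sketch assumes (as a simplification) that a case is mispredicted by the same amount at every state it visits. It compares errors aggregated per prediction, where a case counts once per state, with errors aggregated per case.

```python
def mse(errors):
    """Mean squared error of a list of signed errors (in hours)."""
    return sum(e * e for e in errors) / len(errors)


def mae(errors):
    """Mean absolute error of a list of signed errors (in hours)."""
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical cases: (case id, number of states visited, error in hours).
# Short cases visit one state and are predicted well; the long case
# visits three states and is predicted badly each time (assumption).
short_cases = [("s1", 1, 5.0), ("s2", 1, -5.0), ("s3", 1, 5.0)]
long_cases = [("l1", 3, 300.0)]
all_cases = short_cases + long_cases


def per_prediction_errors(cases):
    """One error per (case, state) pair: a case counts once per state."""
    errors = []
    for _case_id, n_states, err in cases:
        errors.extend([err] * n_states)
    return errors


def per_case_errors(cases):
    """One error per case, regardless of how many states it visits."""
    return [err for _case_id, _n_states, err in cases]


mae_per_prediction = mae(per_prediction_errors(all_cases))
mae_per_case = mae(per_case_errors(all_cases))
mse_per_prediction = mse(per_prediction_errors(all_cases))
mse_per_case = mse(per_case_errors(all_cases))

print(mae_per_prediction, mae_per_case)
print(mse_per_prediction, mse_per_case)
```

In this toy setting the single long case contributes three of the six predictions, so both error measures are roughly twice as large when aggregated per prediction as when aggregated per case.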
6 CONCLUSIONS AND FUTURE WORK
In this paper we presented an approach for gaining insight into a
business process. The approach is divided into three main parts:
1) a transition system; 2) a final status prediction; and 3) a
time remaining prediction. For that, Process Mining and Data
Mining techniques were used. Several data mining algorithms were
applied over the different states of the transition system, so
that the path that a case followed was taken into account.
To validate our approach, we used a case study from a financial
department with real-life data. The transition system made it
possible to take into account the path that a case followed by
training each data mining model (for status and time remaining
prediction) only on the cases that are in the same state.
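A minimal sketch of the per-state training idea follows. It is not the implementation used in the case study: the state names are hypothetical, the state abstraction (here, the sequence of activities seen so far) is an assumption, and the learner is the simplest possible one, predicting the mean remaining time observed among training cases in the same state.

```python
from collections import defaultdict
from statistics import mean


def fit_per_state_average(training_rows):
    """Fit one trivial model per state.

    training_rows: iterable of (state, remaining_hours) pairs, one pair
    for every time a training case was observed in a state.
    Returns a dict mapping each state to its mean remaining time.
    """
    by_state = defaultdict(list)
    for state, remaining_hours in training_rows:
        by_state[state].append(remaining_hours)
    return {state: mean(values) for state, values in by_state.items()}


def predict_remaining(model, state, default=None):
    """Predict remaining hours for a running case currently in `state`.

    Falls back to `default` when the state never occurred in training.
    """
    return model.get(state, default)


# Hypothetical training data: states are encoded as the activities
# performed so far (an assumed abstraction), with hours remaining.
rows = [
    ("created", 100.0),
    ("created", 140.0),
    ("created>reviewed", 40.0),
    ("created>reviewed", 60.0),
    ("created>reviewed>escalated", 200.0),
]

model = fit_per_state_average(rows)
print(predict_remaining(model, "created"))
print(predict_remaining(model, "created>reviewed"))
print(predict_remaining(model, "unseen-state"))
```

Replacing the per-state mean with a regressor fitted on the case attributes of each state would follow the same structure: group the training rows by state, fit one model per group, and look up the model by the current state of the running case.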
The case study used to test the classification algorithms had two
main problems: 1) an imbalanced dataset, with more than 90% of the
cases being approved, and 2) some cases with similar attributes
and similar paths but different final statuses. For this scenario,
as explained in Section 5, Random Forest was the algorithm with
the best performance for predicting the final status of a case.
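The first problem can be made concrete with a small sketch on invented labels. The 90/10 split mirrors the imbalance reported above, while the label "rejected" for the minority class is an assumption made for illustration. A classifier that always predicts the majority class reaches high accuracy while never identifying a minority case, which is why a class-sensitive measure such as balanced accuracy is more informative here.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fn, fp, tn) treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fn, fp, tn


# Invented labels: 90% approved, 10% rejected ("rejected" is assumed).
y_true = ["approved"] * 90 + ["rejected"] * 10
# Majority-class baseline: always predict "approved".
y_pred = ["approved"] * 100

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive="rejected")
sensitivity = tp / (tp + fn)  # share of rejected cases that were found
specificity = tn / (tn + fp)  # share of approved cases that were found
balanced_accuracy = (sensitivity + specificity) / 2

print(accuracy, sensitivity, specificity, balanced_accuracy)
```

The baseline scores 0.9 on accuracy but only 0.5 on balanced accuracy, the same value a random guess would obtain.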
Regarding the time remaining prediction, interesting results were
achieved with different algorithms. The poor performance of the
linear models suggests that there is no linear relation between
the attributes and the time until completion. The best results
were achieved with Gradient Boosting.
MODELSWARD 2019 - 7th International Conference on Model-Driven Engineering and Software Development