Extensions, Analysis and Experimental Assessment of a Probabilistic Ensemble-learning Framework for Detecting Deviances in Business Process Instances
Alfredo Cuzzocrea, Francesco Folino, Massimo Guarascio, Luigi Pontieri
2017
Abstract
This paper significantly extends a previous proposal that introduced an ensemble-learning framework, based on multi-view learning, for mining business process deviances. We make three contributions: (i) a further learning method that extends and refines the previous methods by probabilistically combining different deviance detection models (DDMs); (ii) a complete conceptual architecture that implements the extended multi-view ensemble-learning framework; (iii) a comprehensive experimental assessment of the framework, including a comparison with existing competitors. The work falls within the Business Process Intelligence (BPI) research area, which is relevant to a wide range of real-life applications. These contributions confirm, respectively, the flexibility, the reliability and the effectiveness of the general deviance detection framework.
Paper Citation
in Harvard Style
Cuzzocrea A., Folino F., Guarascio M. and Pontieri L. (2017). Extensions, Analysis and Experimental Assessment of a Probabilistic Ensemble-learning Framework for Detecting Deviances in Business Process Instances. In Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-247-9, pages 162-173. DOI: 10.5220/0006340001620173
in Bibtex Style
@conference{iceis17,
author={Alfredo Cuzzocrea and Francesco Folino and Massimo Guarascio and Luigi Pontieri},
title={Extensions, Analysis and Experimental Assessment of a Probabilistic Ensemble-learning Framework for Detecting Deviances in Business Process Instances},
booktitle={Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2017},
pages={162-173},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006340001620173},
isbn={978-989-758-247-9},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Extensions, Analysis and Experimental Assessment of a Probabilistic Ensemble-learning Framework for Detecting Deviances in Business Process Instances
SN - 978-989-758-247-9
AU - Cuzzocrea A.
AU - Folino F.
AU - Guarascio M.
AU - Pontieri L.
PY - 2017
SP - 162
EP - 173
DO - 10.5220/0006340001620173