file, event attributes profile and performance profile.
Then, by converting the defined profiles into an aggregate
vector, the distance between any two traces can be
calculated. One advantage of this technique is that it
provides a full range of metrics for clustering traces.
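The profile-based idea can be illustrated with a minimal sketch. It assumes two simple profiles (activity counts and directly-follows transition counts), concatenates them into one aggregate vector per trace, and uses the Euclidean distance. The function names, the toy traces and the choice of distance are illustrative assumptions and are not taken from the cited work.

```python
from collections import Counter
from math import sqrt


def activity_profile(trace, activities):
    """Count how often each activity occurs in the trace."""
    counts = Counter(trace)
    return [counts[a] for a in activities]


def transition_profile(trace, transitions):
    """Count how often each directly-follows pair occurs in the trace."""
    counts = Counter(zip(trace, trace[1:]))
    return [counts[t] for t in transitions]


def euclidean_distance(u, v):
    """Euclidean distance between two equally long vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# Toy traces (hypothetical data, for illustration only).
traces = {
    "case1": ["a", "b", "c", "d"],
    "case2": ["a", "c", "b", "d"],
    "case3": ["a", "e", "e", "d"],
}

# Fix one ordering of the dimensions so that all vectors are comparable.
activities = sorted({a for t in traces.values() for a in t})
transitions = sorted({p for t in traces.values() for p in zip(t, t[1:])})

# Aggregate vector: concatenation of the individual profile vectors.
vectors = {
    cid: activity_profile(t, activities) + transition_profile(t, transitions)
    for cid, t in traces.items()
}

d12 = euclidean_distance(vectors["case1"], vectors["case2"])
d13 = euclidean_distance(vectors["case1"], vectors["case3"])
```

In this toy log, case1 and case2 contain the same activities in a different order, so they differ only in the transition profile and end up closer to each other than case1 and case3.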
Context-aware trace clustering methods are pro-
posed in (Bose and van der Aalst, 2010) and (Bose
and van der Aalst, 2009). In (Bose and van der Aalst,
2010) the authors indicate that feature sets based
on sub-sequences of different lengths are context-aware
for the vector space model and can reveal sets of
common functionality accessed by the process.
Two traces that share many conserved features
should be put in the same cluster. In (Bose
and van der Aalst, 2009) the authors present an edit
distance-based approach for partitioning traces into
clusters such that each cluster consists of traces with
similar structure. The cost of each edit operation is
derived from the contexts of the activities involved, so
that the calculated edit distance between traces is more accurate.
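As a rough illustration of an edit distance with non-uniform costs, the sketch below implements a generic weighted Levenshtein distance whose substitution and insertion/deletion costs are supplied as functions. The cost values and the set `similar_pairs` are hypothetical placeholders; in the cited approach the costs are derived from the contexts of activities in the event log, which is not reproduced here.

```python
def edit_distance(s, t, sub_cost, indel_cost):
    """Weighted Levenshtein distance between two traces s and t.

    sub_cost(a, b) is the cost of replacing activity a by b;
    indel_cost(a) is the cost of inserting or deleting activity a.
    """
    m, n = len(s), len(t)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + indel_cost(s[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + indel_cost(t[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = s[i - 1] == t[j - 1]
            d[i][j] = min(
                d[i - 1][j] + indel_cost(s[i - 1]),   # delete from s
                d[i][j - 1] + indel_cost(t[j - 1]),   # insert into s
                d[i - 1][j - 1]
                + (0.0 if same else sub_cost(s[i - 1], t[j - 1])),
            )
    return d[m][n]


# Hypothetical costs: activities assumed to occur in similar contexts
# are cheaper to substitute for one another.
similar_pairs = {frozenset(("b", "c"))}


def sub_cost(a, b):
    return 0.5 if frozenset((a, b)) in similar_pairs else 1.0


def indel_cost(a):
    return 1.0


d_similar = edit_distance(list("abd"), list("acd"), sub_cost, indel_cost)
d_dissimilar = edit_distance(list("abd"), list("aed"), sub_cost, indel_cost)
```

With these placeholder costs, replacing `b` by `c` is cheaper than replacing `b` by `e`, so the first pair of traces is considered closer than the second even though both differ in exactly one position.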
In (Weerdt et al., 2013) a novel technique for trace
clustering is presented which directly optimises the
accuracy of each cluster's underlying process model.
This method neither uses the vector space model nor
defines a distance metric for trace clustering; it
simply assigns suitable traces to each cluster
so that the combined accuracy of the models discovered
for these clusters is maximised. In this way, the method
resolves the divergence between the clustering
bias and the evaluation bias.
Classification techniques are widely used in the area of
decision mining within business process mining. In
(Rozinat and van der Aalst, 2006) the authors developed
a Decision Miner based on a decision tree algorithm,
which aims at analysing the choice constructs
of process models by exploiting the event attributes
recorded in event logs.
6 CONCLUSION
In this paper we proposed and elaborated the basic
definition of Multi-label Case Classification. Next,
a concrete systematic method was introduced which
is able to discover all of the label-related structural
features of traces and transform these features
into case attributes for the subsequent classification task.
The effectiveness and practicability of our technique were
then demonstrated through a case study.
Our next research task will be to focus on exploit-
ing the decision trees generated by our technique so as
to clearly reveal the influences of different categories
on the execution of business processes.
REFERENCES
Bose, R. and van der Aalst, W. (2009). Context Aware Trace
Clustering: Towards Improving Process Mining Re-
sults. In Proceedings of the SIAM International Con-
ference on Data Mining, pages 401–412.
Bose, R. and van der Aalst, W. (2010). Trace Cluster-
ing Based on Conserved Patterns: Towards Achiev-
ing Better Process Models. In Business Process
Management Workshops, volume 43 of Lecture Notes
in Business Information Processing, pages 170–181.
Springer Berlin.
SAP Community. (2014). Customer-defined Case Classifi-
cation. http://help.sap.com.
Greco, G., Guzzo, A., Pontieri, L., and Saccà, D. (2006).
Discovering Expressive Process Models by Clustering
Log Traces. IEEE Transactions on Knowledge and
Data Engineering, 18(8):1010–1027.
Han, J. and Kamber, M. (2000). Data Mining: Concepts
and Techniques. Morgan Kaufmann, 2nd edition.
Kotsiantis, S. (2007). Supervised Machine Learning: A
Review of Classification Techniques. In Proceed-
ings of the 2007 Conference on Emerging Artificial
Intelligence Applications in Computer Engineering:
Real Word AI Systems with Applications in eHealth,
HCI, Information Retrieval and Pervasive Technolo-
gies, pages 3–24. IOS Press.
Quinlan, J. (1993). C4.5: Programs for Machine Learning.
Morgan Kaufmann.
Rozinat, A. and van der Aalst, W. (2006). Decision Mining
in ProM. In International Conference on Business
Process Management (BPM 2006), volume 4102 of
Lecture Notes in Computer Science, pages 420–425.
Springer Berlin.
Song, M., Günther, C., and van der Aalst, W. (2009). Trace
Clustering in Process Mining. In Business Process
Management Workshops, volume 17 of Lecture Notes
in Business Information Processing, pages 109–120.
Springer Berlin.
Tsoumakas, G. and Katakis, I. (2007). Multi-label Classi-
fication: An Overview. International Journal of Data
Warehousing and Mining, 3(3):10–13.
Carvalho, A. and Freitas, A. A. (2009). A Tutorial on
Multi-label Classification Techniques. Foundations of
Computational Intelligence, Studies in Computational
Intelligence, pages 117–195. Springer Berlin.
van der Aalst, W. (2011). Process Mining: Discovery,
Conformance and Enhancement of Business Processes.
Springer Berlin Heidelberg, Berlin, 1st edition.
Weerdt, J. D., vanden Broucke, S., Vanthienen, J., and
Baesens, B. (2013). Active Trace Clustering for Improved
Process Discovery. IEEE Transactions on Knowledge
and Data Engineering, 25(12):2708–2720.
Weijters, A. and Ribeiro, J. (2011). Flexible Heuristics
Miner (FHM). In Proceedings of CIDM, pages 310–
317.
Wang, J. and Han, J. (2004). BIDE: Efficient Mining of Fre-
quent Closed Sequences. In 20th Int. Conf. on Data
Engineering, Boston, MA.
ICEIS 2015 - 17th International Conference on Enterprise Information Systems