ment. Let T be the universe of all (possibly partial)
traces that may appear in any log of the process un-
der analysis. For any trace τ ∈ T , len(τ) is the num-
ber of events in τ, while τ[i] is the i-th event of τ, for
i = 1 .. len(τ), with task(τ[i]) and time(τ[i]) denoting
the task and timestamp of τ[i], respectively. We also
assume that the first event of each trace is always as-
sociated with a unique “initial” task (possibly added
artificially), and its timestamp registers the time when
the corresponding process instance started.
For any trace τ, let context(τ) be a tuple gathering a
series of data about the execution context of τ, ranging
from intrinsic data properties to environmental variables
characterizing the state of the BPM system when
τ was enacted.
For ease of notation, let $A_T$ denote the set of all the
tasks (a.k.a. activities) that may occur in some trace
of T, and context(T) be the space of context vectors,
i.e., $A_T = \bigcup_{\tau \in T} \mathit{tasks}(\tau)$, where
tasks(τ) is the set of tasks occurring in τ, and
context(T) = {context(τ) | τ ∈ T}.
Finally, a log L is a finite subset of T .
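The notation above maps naturally onto a small data structure. The following is a minimal sketch, not part of the original formalization: the class names `Event` and `Trace` and the helper `task_universe` are hypothetical, timestamps are assumed to be plain numbers, and context(τ) is assumed to be a dictionary of attribute values.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass(frozen=True)
class Event:
    task: str    # task(τ[i])
    time: float  # time(τ[i]); assumed numeric for simplicity


@dataclass
class Trace:
    events: List[Event]
    # context(τ); assumed here to be a dict of attribute values
    context: Dict[str, Any] = field(default_factory=dict)

    def __len__(self) -> int:
        # len(τ): number of events in the trace
        return len(self.events)

    def __getitem__(self, i: int) -> Event:
        # τ[i] with the 1-based indexing used in the text
        if not 1 <= i <= len(self.events):
            raise IndexError(i)
        return self.events[i - 1]

    def tasks(self) -> Set[str]:
        # tasks(τ): the set of tasks occurring in the trace
        return {e.task for e in self.events}


def task_universe(log: List[Trace]) -> Set[str]:
    """Union of tasks(τ) over the traces of a log (A_T restricted to that log)."""
    result: Set[str] = set()
    for trace in log:
        result |= trace.tasks()
    return result
```

Since a log L is a finite subset of T, `task_universe` only approximates $A_T$ by the tasks actually observed in the given log.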
Workflow Schemas and Behavioral Profiles. Var-
ious languages have been proposed in the literature
for specifying the behavior of a business process, in
terms of its composing activities and their mutual
dependencies, such as Petri nets (van der Aalst, 1998),
causal nets (van der Aalst et al., 2011), and heuristics
nets (Weijters and van der Aalst, 2003). For the
sake of concreteness we next focus on the language of
heuristics nets (Weijters and Ribeiro, 2011), in which a
workflow schema is a directed graph where each node
represents a process activity, each edge (x,y) encodes
a dependency of y on x, and each fork (resp., join)
node can be associated with cardinality constraints
over the edges exiting from (resp., entering) it.
The behavior modeled by a workflow schema can
be captured approximately by way of simple pair-
wise relationships between the activities featuring in
it, named (causal) behavioral profiles (Weidlich et al.,
2011), which can be computed efficiently for many
classes of models.
Let W be a workflow schema, and A(W) be its associated
activities. Let $\succ_W$ be a “weak order” relation
inferred from W, such that, for any x, y ∈ A(W), it is
$y \succ_W x$ iff there is at least one trace admitted by W
where y occurs after x. Then the behavioral profile matrix
of W, denoted by B(W), is a function mapping each
pair (x,y) ∈ A(W) × A(W) to an ordering relation in
$\{\rightsquigarrow, +, \parallel\}$, as follows:
(i) $B(W)[x,y] = \rightsquigarrow$ iff $y \succ_W x$ and $x \nsucc_W y$ (strict order);
(ii) $B(W)[x,y] = +$ iff $x \nsucc_W y$ and $y \nsucc_W x$ (exclusiveness);
(iii) $B(W)[x,y] = \parallel$ iff $x \succ_W y$ and $y \succ_W x$ (either interleaving or loop).
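Cases (i)-(iii) can be illustrated with a short sketch. It rests on an assumption that is not in the text: the weak order relation is derived here from a finite list of traces admitted by the schema, whereas the text infers it from W itself. The function names `weak_order` and `behavioral_profile` and the string encodings of the three relations are hypothetical.

```python
from itertools import product
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical string encodings of the three ordering relations
STRICT, EXCL, INTER = "->", "+", "||"


def weak_order(traces: Iterable[List[str]]) -> Set[Tuple[str, str]]:
    """Return all pairs (y, x) such that y occurs after x in some trace."""
    after: Set[Tuple[str, str]] = set()
    for trace in traces:
        for i, x in enumerate(trace):
            for y in trace[i + 1:]:
                after.add((y, x))
    return after


def behavioral_profile(
    activities: Iterable[str], after: Set[Tuple[str, str]]
) -> Dict[Tuple[str, str], str]:
    """Build B[x, y] from the weak order, following cases (i)-(iii)."""
    profile: Dict[Tuple[str, str], str] = {}
    for x, y in product(list(activities), repeat=2):
        y_after_x = (y, x) in after
        x_after_y = (x, y) in after
        if y_after_x and not x_after_y:
            profile[(x, y)] = STRICT   # (i) strict order
        elif not y_after_x and not x_after_y:
            profile[(x, y)] = EXCL     # (ii) exclusiveness
        elif y_after_x and x_after_y:
            profile[(x, y)] = INTER    # (iii) interleaving or loop
        # Remaining case (x after y only) is not covered by (i)-(iii);
        # it is left unset here, while the pair (y, x) receives STRICT.
    return profile
```

For example, with the admitted traces `["s","a","b","c"]`, `["s","b","a","c"]` and `["s","d"]`, the sketch assigns strict order to (s, a), interleaving to (a, b), and exclusiveness to (a, d).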
Let τ be a trace over trace universe T, x and y be
two activities in $A_T$, and B be a behavioral profile
matrix. Then we say that τ violates (resp., satisfies)
B[x,y], denoted by $\tau \nvdash B[x,y]$ (resp., $\tau \vdash B[x,y]$), if
the occurrences of x and y in τ infringe (resp., fulfill)
the ordering constraint stated in B[x,y]. More specifically,
it is $\tau \nvdash B[x,y]$ iff there exist i, j ∈ {1,...,len(τ)}
such that task(τ[i]) = y, task(τ[j]) = x, and either
(i) B[x,y] = +, or (ii) $B[x,y] = \rightsquigarrow$ and i < j.
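The violation test can be sketched as follows. This is only an illustration under stated assumptions: a trace is reduced to its list of task labels, the profile is a dictionary keyed by activity pairs, and the name `violates` and the string encodings of the relations are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical string encodings of the three ordering relations
STRICT, EXCL, INTER = "->", "+", "||"


def violates(
    trace: List[str], x: str, y: str, profile: Dict[Tuple[str, str], str]
) -> bool:
    """Check whether a trace (as a list of task labels) violates B[x, y]."""
    relation = profile.get((x, y))
    pos_x = [j for j, task in enumerate(trace) if task == x]
    pos_y = [i for i, task in enumerate(trace) if task == y]
    if not pos_x or not pos_y:
        # No violation is possible unless both x and y occur in the trace
        return False
    if relation == EXCL:
        # (i) exclusive activities co-occur in the same trace
        return True
    if relation == STRICT:
        # (ii) some occurrence of y precedes some occurrence of x,
        # i.e. there exist i < j with task i = y and task j = x
        return min(pos_y) < max(pos_x)
    # Interleaving (or an unset entry) imposes no constraint
    return False
```

Comparing the earliest occurrence of y with the latest occurrence of x is enough, since a pair i < j exists exactly when that extreme pair satisfies the inequality.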
Conceptual Clustering Models. The core assumption
underlying our work is that the behavior of a process
depends on context factors. Hence, in order to predict
the structure of any trace τ, we regard its associated
context properties context(τ) as descriptive attributes.
For the sake of interpretability, we seek a concep-
tual clustering model encoded in terms of decision
rules over context(T). Let us define a conceptual
clustering rule over a trace universe T as a disjunction
of conjunctive boolean formulas, of the form
$$[(A^1_1 \in V^1_1) \wedge (A^1_2 \in V^1_2) \wedge \dots \wedge (A^1_{k_1} \in V^1_{k_1})] \vee [(A^2_1 \in V^2_1) \wedge \dots \wedge (A^2_{k_2} \in V^2_{k_2})] \vee \dots \vee [(A^n_1 \in V^n_1) \wedge \dots \wedge (A^n_{k_n} \in V^n_{k_n})],$$
where $n, k_1, \dots, k_n \in \mathbb{N}$, and, for each $i \in \{1, \dots, n\}$
and $j \in \{1, \dots, k_i\}$, $A^i_j$ is a descriptive attribute
defined on T’s instances (i.e., one of the dimensions of
the space context(T)), and $V^i_j$ is a subset of the
domain of attribute $A^i_j$.
For any L ⊆ T and for any such rule r, let
cov(r,L) be the set of all L’s traces that satisfy r.
A conceptual clustering for L is a pair $C = \langle CS, R \rangle$,
such that $CS = \{c_1, \dots, c_n\}$ is a partition of L into n
clusters (for some n ∈ N), and R is a function mapping
the clusters in CS to mutually exclusive rules like
those above (note that $\bigcup_{i=1}^{n} c_i = L$ and $\bigcap_{i=1}^{n} c_i = \emptyset$),
while cov(R(c),L) = c for any c in CS. Due to their
generality, such rules can split any subset L′ of T into
n clusters: $\{cov(R(c_1),L'), \dots, cov(R(c_n),L')\}$ is
indeed a partition of L′.
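The rule format and the coverage function cov(r,L) can be made concrete with a small sketch. It assumes that each trace is represented only by its context, given as a dictionary from attribute names to values; a rule is encoded as a list of conjunctions, each mapping an attribute to its set of admitted values. The names `satisfies`, `cov`, and `is_partition`, as well as the attributes `channel` and `priority` in the usage note, are hypothetical.

```python
from typing import Any, Dict, List, Set

Context = Dict[str, Any]
Conjunction = Dict[str, Set[Any]]  # attribute A -> admitted value set V
Rule = List[Conjunction]           # disjunction of conjunctions


def satisfies(context: Context, rule: Rule) -> bool:
    """True iff at least one conjunction of the rule holds for the context."""
    return any(
        all(context.get(attr) in allowed for attr, allowed in conj.items())
        for conj in rule
    )


def cov(rule: Rule, contexts: List[Context]) -> List[Context]:
    """cov(r, L): the elements of L that satisfy rule r, in their original order."""
    return [c for c in contexts if satisfies(c, rule)]


def is_partition(rules: List[Rule], contexts: List[Context]) -> bool:
    """True iff every element is covered by exactly one of the rules."""
    return all(
        sum(satisfies(c, rule) for rule in rules) == 1 for c in contexts
    )
```

As a usage example, the rules `[{"channel": {"web"}, "priority": {"high", "medium"}}]` and `[{"channel": {"phone", "mail"}}, {"priority": {"low"}}]` split a log whose traces carry the contexts (web, high), (phone, high), and (web, low) into two clusters, with the first trace in one cluster and the remaining two in the other.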
In this work, we propose to discover a special
kind of conceptual clustering model for a given set
of log traces, by resorting to a predictive cluster-
ing approach (Blockeel and Raedt, 1998). In gen-
eral, in a predictive clustering setting, two kinds of
attributes are assumed to be available for each el-
ement z of a given space Z = X × Y of instances:
descriptive attributes and target ones, denoted by
descr(z) ∈ X and targ(z) ∈ Y, respectively. Hence,
the goal is to find a partitioning function (similar to
the conceptual clustering models above) that minimizes
$\sum_{C_i} |C_i| \times Var(\{targ(z) \mid z \in C_i\})$, where
variable $C_i$ ranges over current clusters, and Var(S) is the
variance of set S. In our setting, the context data
associated with each trace will be used as its descriptive
features, whereas some basic behavioral patterns ex-