to obtain a concise representation of the available data in the reduced dataset, such as org:resource, lifecycle:transition, and time:timestamp. Feature selection also permits dropping irrelevant data such as case:LoanGoal, MonthlyCost, and case:RequestedAmount, which could slow down our system or affect the accuracy of the results.
• Feature creation generated new features that aim at capturing the most important information in the dataset more efficiently than the original features. We used two techniques, feature extraction and feature construction. With the former, we extracted the resource’s availability time, $R_k[b, e]$, from the time:timestamp feature in the dataset. For each resource $R_k$, $b$ and $e$ refer, respectively, to the lowest and highest values per date of the time part of the time:timestamp feature. With the latter, we performed a deep analysis of the reduced dataset to define the consumption-time interval of the resource, $T_i^{R_k}[x_{ij}^{con}, y_{ij}^{con}]$. For each task, we computed the time between its instances and their successors; the average time is then taken as the consumption-time interval for that task (a sketch of this computation is given at the end of this subsection).
Once a resource’s availability-time interval and consumption-time interval are defined, we worked on task/resource time-connection as per Section 4.3. In addition, the analysis of the dataset and of several papers on the BPI Challenge 2017 allowed us to work on the consumption property of each resource and the transactional property of each task. For instance, we noticed that the execution of some tasks was suspended when the resource’s availability time ended; hence, we assigned the limited property to such resources. As for the transactional property, we associated the compensatable property with the make offer task, since 3% of offers were canceled or refused after their successful execution according to (Bolt, 2017). Finally, we added the consumption and transactional properties to the preprocessed dataset.
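The following is a minimal sketch of how the two intervals could be derived with Pandas. The column names (case:concept:name, concept:name, org:resource, time:timestamp) follow the usual CSV export of the BPI Challenge 2017 event log; the file name and the grouping logic are our assumptions, not the exact implementation used in the experiments.

import pandas as pd

# Assumption: the reduced BPI Challenge 2017 event log is available as a CSV file.
log = pd.read_csv("bpi2017_reduced.csv", parse_dates=["time:timestamp"])

# Availability-time interval R_k[b, e]: per resource and per date,
# b and e are the lowest and highest time parts of time:timestamp.
log["date"] = log["time:timestamp"].dt.date
log["clock"] = log["time:timestamp"].dt.time
availability = (log.groupby(["org:resource", "date"])["clock"]
                   .agg(b="min", e="max")
                   .reset_index())

# Consumption-time interval: for each task, the average time between
# an event and its successor within the same case.
log = log.sort_values(["case:concept:name", "time:timestamp"])
log["to_successor"] = -log.groupby("case:concept:name")["time:timestamp"].diff(-1)
consumption = log.groupby("concept:name")["to_successor"].mean().reset_index()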
5.2 Task/Resource Recommendations
To enact task/resource coordination, we resorted to process mining, which is known for addressing BP decision-making problems and providing recommendations to improve future BP executions. In this context, a good number of recommendation techniques are reported in the literature. We opted for Decision Trees (DT)4, known for their simplicity, performance, and ability to visually communicate choices. We created a prediction model following two stages, offline and online, which allow building and then applying the required prediction model.
4 scikit-learn.org/stable/modules/tree.html.
Offline Stage. We built a prediction model using an open-source Python library called Sklearn5. It supports a variety of built-in prediction models (e.g., DT, KNN, and SVM). Building a prediction model with Sklearn usually starts with preparing the dataset in the most suitable format as per Section 5.1, which is a Pandas DataFrame in our case. Then, the target of the prediction model must be defined; in our case, it is the recommendation of the actions to take when a resource is assigned to a task (e.g., adjusting the resource-availability time or the consumption-time interval). Finally, the set of variables that may affect the recommendations, such as the resource’s availability time and consumption time, is defined.
In terms of technical details, we first selected the DT model and then set its parameters, namely the attribute-selection criterion (Entropy or Gini index) and the maximum depth of the tree. These parameters are critical to the accuracy of the results and are usually set manually after several trials to find the best results. Afterwards, we fitted the DT model to the specified data. We had to split the reduced dataset into a training dataset and a test dataset using Sklearn’s train_test_split6 built-in function. Finally, we evaluated the accuracy of our model by comparing the result of the prediction to the real data included in the test dataset as per Section 5.3.
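A minimal sketch of this offline stage is shown below. It assumes the preprocessed Pandas DataFrame is called df and that the target column holding the recommended action is named recommended_action; both names, as well as the parameter values, are illustrative assumptions rather than the exact settings used in our experiments.

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Assumption: df is the preprocessed DataFrame from Section 5.1 and
# recommended_action is the target label (the action to take for a task/resource pair).
X = df.drop(columns=["recommended_action"])
y = df["recommended_action"]

# Hold out part of the reduced dataset for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Decision Tree with the attribute-selection criterion and maximum depth
# set after several trials (values here are placeholders).
dt = DecisionTreeClassifier(criterion="entropy", max_depth=5)
dt.fit(X_train, y_train)

# Compare predictions against the real labels of the test dataset.
print("Accuracy:", accuracy_score(y_test, dt.predict(X_test)))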
Online Stage. The objective here is to recommend actions to take when a resource is assigned to a task in new executions. Obviously, recommendations are determined using the prediction model built during the offline stage. We carried out some experiments to predict the actions to perform for each task/resource pair with respect to new simulated BP instances and checked how accurate our prediction model is. During the experiments, we considered 30 simulated instances of the BP. The accuracy of the results is discussed next.
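As an illustrative sketch of this online stage, and assuming new_instances is a DataFrame holding the 30 simulated BP instances encoded with the same features as the training data (the variable names are hypothetical):

# Assumption: dt is the Decision Tree fitted during the offline stage and
# new_instances contains the simulated BP instances with the same feature columns.
recommended_actions = dt.predict(new_instances)
for instance_id, action in zip(new_instances.index, recommended_actions):
    print(f"Instance {instance_id}: recommended action -> {action}")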
5.3 Discussions of the Results
To assess our DT-based prediction model’s recommendations, we also adopted k-Nearest Neighbors (KNN) as another technique for developing prediction models. As for the DT prediction model, the experiments showed encouraging results during both the offline and online stages. In this context,
5 scikit-learn.org/stable/index.html.
6 scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html.