Let us suppose that we have data giving details
of three sequences that the user is known to
demonstrate:
1. Putting in preparation time before an
administration meeting.
2. Putting in preparation time before a project
meeting or presentation.
3. Putting in travel time before paying a visit
to another company.
Examples of the sequences take the form of a pair
of tasks joined via the ‘sequence’ relationship.
All of the sequences (plus any others within the
data set) are collected as a single set of examples,
which must be split into separate clusters so that
each sequence can be learnt separately.
We perform the initial splitting of the data by
applying a bottom-up agglomerative clustering
algorithm to the first task of each pair, producing
a group of subsets, and then applying the same
algorithm within each subset to the second task of
each pair, producing the final clusters of examples.
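The two-stage clustering above can be sketched as follows. This is a minimal illustration, assuming tasks are represented as attribute tuples and using a simple attribute-mismatch count as the distance measure; the actual representation and metric used in the system may differ.

```python
# Two-stage bottom-up agglomerative clustering over pairs of tasks
# (a sketch; task representation and distance measure are assumptions).

def distance(task_a, task_b):
    """Hypothetical distance: number of attributes on which two tasks differ."""
    return sum(1 for a, b in zip(task_a, task_b) if a != b)

def agglomerate(items, key, threshold=0):
    """Single-linkage bottom-up clustering: merge two clusters while their
    closest members (under `key`) are within `threshold` of each other."""
    clusters = [[item] for item in items]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if min(distance(key(x), key(y))
                       for x in clusters[i] for y in clusters[j]) <= threshold:
                    clusters[i].extend(clusters.pop(j))
                    merged = True
                    break
            if merged:
                break
    return clusters

# Each example is a pair of tasks; each task is a (type, subtype, duration) tuple.
examples = [
    (("prep", "meeting", 1), ("admin", "meeting", 2)),
    (("prep", "meeting", 1), ("project", "meeting", 2)),
    (("travel", "london", 2), ("visit", "ericsson", 2)),
]

# Stage 1: cluster on the first task of each pair.
stage1 = agglomerate(examples, key=lambda ex: ex[0])
# Stage 2: re-cluster each subset on the second task.
final = [c for subset in stage1
         for c in agglomerate(subset, key=lambda ex: ex[1])]
```

With these toy examples, stage 1 groups the two preparation sequences together (identical first tasks) and stage 2 then separates them by their differing second tasks, yielding three final clusters.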
Each cluster is used as the example set for a series of
learning problems, each of which focuses on a different
attribute within one or other of the tasks and attempts
to find any specialisations that help to characterise
the particular sequence under examination. For example,
the first learning problem would focus on the type of
the first task of the pair in each example and would
attempt to find any regularities amongst all the
examples of the set for that particular attribute.
Subsequent learning problems would focus on the
subtype, sub-subtype, and duration of the first task
individually, and then a further set of learning
problems would focus on the individual attributes of
the second task in the same manner.
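The decomposition into single-attribute problems can be sketched as below. The attribute names follow the text (type, subtype, sub-subtype, duration), but the tuple representation and the placeholder values are assumptions for illustration.

```python
# One cluster of examples spawns a series of single-attribute learning
# problems: one per attribute of the first task, then one per attribute
# of the second task (a sketch; the representation is assumed).

ATTRIBUTES = ["type", "subtype", "sub_subtype", "duration"]

def learning_problems(cluster):
    """Return one (task_position, attribute, positive_values) problem per
    attribute of each task in the pair -- eight problems in all."""
    problems = []
    for position in (0, 1):                      # first task, then second
        for idx, attr in enumerate(ATTRIBUTES):
            values = [example[position][idx] for example in cluster]
            problems.append((position, attr, values))
    return problems

# Hypothetical cluster: two examples of the travel-before-visit sequence.
cluster = [
    (("travel", "london", None, "2hrs"), ("visit", "ericsson", None, "2hrs")),
    (("travel", "leeds", None, "1hr"), ("visit", "bt", None, "3hrs")),
]
problems = learning_problems(cluster)
```

Each resulting problem hands the learner only the values of one attribute across the cluster, so a regularity such as "the first task is always of type travel" can be sought in isolation.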
Each learning problem requires positive
examples, negative examples and background
knowledge. The positive examples are the examples
contained within the set that is currently under
examination. The background knowledge used for
each learning task is a subset of the entire set of
background knowledge available: only those items
which directly refer to the attribute under
examination are presented to the learner for each
problem. It is this splitting of the available
background knowledge into subsets, in conjunction
with the splitting of the overall learning problem
into separate smaller problems (i.e. problems in
which the required clauses are much shorter), that
enables the learner to tackle the overall problem of
learning a user model: it reduces the number of
possible hypotheses to be considered to a level
manageable by the ILP engine.
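The background-knowledge restriction might be sketched as follows. Tagging each background item with the attribute it refers to is an assumed representation; the clause strings are illustrative Prolog-style stand-ins, not the system's actual knowledge base.

```python
# Restricting background knowledge per learning problem (a sketch).
# Each item is tagged with the attribute it refers to; only matching
# items reach the ILP engine, shrinking the hypothesis space.

BACKGROUND = [
    ("type", "meeting(admin)."),
    ("type", "meeting(project)."),
    ("duration", "short(D) :- D =< 1."),
    ("duration", "long(D) :- D >= 3."),
]

def background_for(attribute):
    """Select only the background clauses that refer to `attribute`."""
    return [clause for attr, clause in BACKGROUND if attr == attribute]

type_bk = background_for("type")
```

A learning problem over the ‘type’ attribute would then receive only the two type-related clauses rather than the whole knowledge base.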
A set of automatically generated ‘negative
examples’ is produced for the attribute currently
under examination. These are examples of pairs of
tasks that the user would never produce and hence
should not be thought of as being dependent on each
other. Each set of negative examples only differs
from the original data supplied by the user by a
small amount, and all of the negative examples
within a set differ from the original data in such a
way that the ILP engine can use part of the provided
background knowledge to successfully exclude all
the negative examples from the solution that it
produces.
If we were to generate a set of negative examples
for the ‘type’ attribute, we would take a user-generated
(positive) example:

sequence: <travel/london, 2hrs, 10-00, thursday>,
          <visit/ericsson, 2hrs, 13-00, thursday>

and alter one of its values, producing:

sequence: <admin/london, 2hrs, 10-00, thursday>,
          <visit/ericsson, 2hrs, 13-00, thursday>
This is repeated several times, using all of the
examples within the set to generate negative
examples. Values substituted into the altered
attribute must place the new example far enough away
from the original (under the distance measure used to
produce the original clusters) that it could not be
considered part of the original cluster. As the
amount of data with which we will be working is
likely to be small, the concepts being learnt may not
be accurately characterised by the examples collected. This
criterion allows a little more ‘space’ between the
positive and negative examples and hence allows the
learner to produce a rule that does not adhere too
tightly to the exact details of the examples collected;
the resulting theory is more general and should
provide better results when asked for predictions.
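The negative-example generation described above can be sketched as follows. The attribute-mismatch distance and the threshold are assumptions standing in for the clustering distance measure the text refers to; the candidate values are illustrative.

```python
# Generate negative examples by substituting a candidate value into one
# attribute of a positive example, keeping the result only if it lies far
# enough from every positive example (distance and threshold are assumed).

def distance(pair_a, pair_b):
    """Number of attribute positions on which two task pairs differ."""
    return sum(1 for a, b in zip(pair_a[0] + pair_a[1], pair_b[0] + pair_b[1])
               if a != b)

def negatives(positives, position, attr_idx, candidates, min_dist=1):
    """Alter one attribute of each positive example with each candidate
    value; keep only alterations at least `min_dist` from all positives."""
    negs = []
    for pair in positives:
        for value in candidates:
            tasks = [list(pair[0]), list(pair[1])]
            tasks[position][attr_idx] = value
            neg = (tuple(tasks[0]), tuple(tasks[1]))
            if all(distance(neg, pos) >= min_dist for pos in positives):
                negs.append(neg)
    return negs

positives = [(("travel/london", "2hrs", "10-00", "thursday"),
              ("visit/ericsson", "2hrs", "13-00", "thursday"))]
# "travel/london" is rejected (identical to a positive); "admin/london" is kept.
negs = negatives(positives, position=0, attr_idx=0,
                 candidates=["admin/london", "travel/london"])
```

This mirrors the ‘type’ substitution shown earlier: the altered example with admin/london survives the distance check, while a substitution that reproduces a positive example is discarded.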
Generating values for substitution where the
variables being examined contain real values (for
example, the duration of a task) presents a further
complication. To ensure that the values returned
for a task prediction are accurate, the range of
values to which the variable can be instantiated
must be limited. Values which are
unacceptable as predicted values can be used to
generate negative examples, but there may be cases
where individual examples within the same cluster
have values for a particular attribute which would be
unsuitable if used within other examples in the same
ICEIS 2005 - ARTIFICIAL INTELLIGENCE AND DECISION SUPPORT SYSTEMS