A comparative evaluation of ontology relationship
learning may be more interesting than absolute
evaluations, as it may expose the differences
between the systems and reveal to what extent they
may be combined in hybrid approaches.
Concept Extraction. The domain chosen for the
evaluation was project management in STATOIL, a
large Norwegian petroleum company. They use a
particular project management methodology, PMI,
that is documented in handbooks and also reflected
in project documentation from their own projects.
Domain experts from STATOIL have, together with
ontology modelers, built a project management
ontology (Gulla et al., 2007), which served as a gold
standard for our concept extraction part.
Our association rule mining system was run on
STATOIL’s documentation of their project
management methodology, PMBOK (PMI, 2000).
This is a book of about 50,600 words (tokens)
divided into 12 chapters.
The system extracted a total of 196 concepts,
compared to the manually constructed ontology’s
142 concepts. Fifty concepts were identical in both
sets, whereas another 61 extracted concepts were
abstractions of similar concepts in the manual
ontology. If we assume that both the 50 perfect
matches and the 61 abstract matches are valid, we
have a precision of 56.7% and a recall of 78.2% for
the concept extraction part.
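The arithmetic behind these figures can be sketched as follows. This is a minimal illustration that assumes every exact or abstract match counts once against both the extracted set and the gold standard; the function name `precision_recall` is ours and not part of the evaluated system. The values it yields are close to the reported figures, and a difference in the last digit may come from rounding.

```python
def precision_recall(num_extracted, num_gold, num_matched):
    """Return (precision, recall) as fractions between 0 and 1."""
    precision = num_matched / num_extracted  # share of extracted concepts that are valid
    recall = num_matched / num_gold          # share of gold-standard concepts that were found
    return precision, recall


# Counts reported in the text: 196 extracted concepts, 142 gold-standard
# concepts, 50 exact matches plus 61 abstract matches.
matched = 50 + 61
precision, recall = precision_recall(196, 142, matched)
print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```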
Relationship Learning. For the relationship part,
we compared the association rule approach to the
cosine similarity system explained above. The
manual ontology did not contain enough
relationships to be of much use in this part of the
evaluation. We first made a distinction between
three types of relationships found by the two
systems:
♦ Relationships suggested only by the
association rule approach
♦ Relationships suggested only by the cosine
similarity approach
♦ Relationships suggested by both approaches
Slightly more than 50% of the relationships found
by the association rule approach were also identified
by the cosine similarity method.
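The three groups can be obtained with plain set operations. The sketch below assumes that a relationship is an unordered pair of concepts, which may not hold for directed association rules. The concept pairs are made up for illustration and the function name `partition_relationships` is hypothetical.

```python
def partition_relationships(assoc_pairs, cosine_pairs):
    """Split suggested relationships into the three evaluation groups.

    Each relationship is treated as an unordered pair of concepts
    (an assumption; rule direction is ignored here).
    """
    assoc = {frozenset(pair) for pair in assoc_pairs}
    cosine = {frozenset(pair) for pair in cosine_pairs}
    return {
        "association_only": assoc - cosine,
        "cosine_only": cosine - assoc,
        "both": assoc & cosine,
    }


# Illustrative input, not taken from the evaluation data.
groups = partition_relationships(
    assoc_pairs=[("cost", "budget"), ("cost", "risk")],
    cosine_pairs=[("budget", "cost"), ("cost", "schedule")],
)
for name, pairs in groups.items():
    print(name, sorted(sorted(pair) for pair in pairs))
```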
A selection of concepts was chosen. For each of
the three groups above, all suggested relationships
to/from these concepts were shown to four persons
who all had project management experience. Each
person individually rated each relationship as not
related (these two concepts are not related), related
(there is probably a relationship between the two
concepts) or highly related (there is definitely a
relationship between these two concepts). An
average score for each relationship was calculated
on the basis of the individual scores from the test
persons. Figure 2 shows the related concepts
suggested for the ontology concept Cost for the three
groups, as well as their average scores.
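The averaging step can be sketched as follows. The numeric coding of the three ratings (0, 1, 2) is an assumption made for illustration, since the coding used in the evaluation is not stated here, and the ratings shown are invented.

```python
from statistics import mean

# Assumed numeric coding of the three rating levels (not given in the text).
RATING_VALUES = {"not related": 0, "related": 1, "highly related": 2}


def average_score(ratings):
    """Average the individual ratings given to one suggested relationship."""
    return mean(RATING_VALUES[rating] for rating in ratings)


# Hypothetical ratings from four test persons for a single relationship.
example_ratings = ["related", "highly related", "highly related", "related"]
print(average_score(example_ratings))
```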
Adding the results for all concepts together, we
can compare the quality of relationships for the three
groups. As shown in Figure 3, association rules and
cosine similarities tend to produce a similar share of
good relationships (scored Related or Highly
related). The two methods suggested 82% and 86%
good relationships, respectively, which is a fairly
good result for such a small document collection. It
should be noted, though, that this does not mean that
they necessarily suggest the same relationships.
The share of very good relationships is worth a
closer inspection. Whereas the association rules
method only generated 7% very good relationships,
the cosine similarity method reached an impressive
24%.
A possible explanation for this difference lies in
the mechanics of association rules and cosine
similarity. For an association rule to be generated,
the corresponding concepts need to occur in a wide
range of documents. This will typically be the case
for very general concepts and their rather general
relationships. The cosine similarity method, on the
other hand, makes use of tf.idf to characterize
concepts by their differences from other concepts, and
the relationships derived from cosine similarities
rest on these discriminating concept vectors. The
relationships become more specialized and precise and
are easier to recognize as very good relationships.
This may also explain why the association rule
method had a larger share of normally good
relationships (75%) than the cosine similarity
method (65%).
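The contrast between the two mechanisms can be illustrated on a toy collection. This is only a sketch under stated assumptions: each document is reduced to concept counts, a concept vector holds the concept's tf.idf weight in every document, and the data and function names are invented for illustration. The representation used by the actual systems may differ.

```python
import math

# Toy corpus: each document maps a concept to its occurrence count (made-up data).
docs = [
    {"project": 3, "cost": 2, "budget": 1},
    {"project": 1, "cost": 1, "budget": 2},
    {"project": 2, "schedule": 3},
    {"project": 1, "schedule": 1, "risk": 2},
]


def support(concepts, docs):
    """Fraction of documents that contain every concept in `concepts`."""
    hits = sum(1 for doc in docs if all(c in doc for c in concepts))
    return hits / len(docs)


def confidence(antecedent, consequent, docs):
    """Confidence of the association rule antecedent -> consequent."""
    base = support([antecedent], docs)
    return support([antecedent, consequent], docs) / base if base else 0.0


def tfidf_vector(concept, docs):
    """tf.idf weight of `concept` in each document (assumed representation)."""
    df = sum(1 for doc in docs if concept in doc)
    idf = math.log(len(docs) / df) if df else 0.0
    return [doc.get(concept, 0) * idf for doc in docs]


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


print(confidence("cost", "project", docs))
print(cosine(tfidf_vector("cost", docs), tfidf_vector("project", docs)))
print(cosine(tfidf_vector("cost", docs), tfidf_vector("budget", docs)))
```

In this toy setting the general concept "project" occurs in every document, so rules involving it reach high confidence, while its tf.idf vector vanishes and it produces no cosine-based relationship. The more specific pair "cost" and "budget" scores high on cosine similarity instead.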
Interestingly, a combination of the two methods
seems to produce much better results than each
individual method. Both methods carry some noise,
but our results indicate that this noise is dramatically
reduced if we only keep the results that are common
to both methods. In total, 97% of the relationships
suggested by both methods were rated as good
relationships by the test group (right column in
Figure 3). 30% were considered very good
relationships. This suggests that the two approaches
– although comparable in quality – are
fundamentally different with their own weaknesses
and strengths. Since overgeneration is already a
problem in relationship learning, a better approach
ICEIS 2008 - International Conference on Enterprise Information Systems