ing the edges of the path between both classes. This
path is a combination of hierarchical links and the
domains and ranges of object properties. They tested
their approach on the extraction of annotations from
the Unified Medical Language System (UMLS), obtaining
a high average precision. In our case,
we can apply our algorithm to any open discourse
with generic ontologies. Castano et al. (2003)
presented an ontology matching algorithm in which
the semantic affinity between two concepts is
evaluated as a function of their relationships in
the thesaurus and of their contexts.
Most WSD approaches are evaluated on a standard
benchmark data set called Senseval (Kilgarriff,
1998). The second version of Senseval contains a
set of corpora, lemmas and instances on which authors
can test the robustness and precision of their
algorithms. In our case, we cannot apply this benchmark,
since our data source has to be represented
in the OWL language.
3 DISAMBIGUATION CONCEPT
PROCESS
The idea behind this algorithm is to find
correspondences between elements of both structures
(WordNet and the ontology) and then to weight each
correspondence with a voting system that combines the
following information into an assessment: the lexical
function of the elements, the semantic function of the
correspondence, and the ambiguity of both elements.
Thus, we have defined a category of elements that
can be found in WordNet and in the ontology, as well as
the voting system applied to each correspondence. Both
tasks are complemented by establishing a context, which
reduces the search space and improves accuracy and
computational cost-effectiveness.
3.1 Elements
In an OWL ontology, we handle only concepts,
but the remaining elements (properties, axioms, and
instances) influence the voting system. An OWL
concept can either be simple, like dog and hotdog,
or be composed of two terms, like WarmTemperature
and PizzaTopping. For that reason, a concept has
one or more possible equivalences in WordNet
(provided it is spelled correctly). A tokenization process
based on capital letters (e.g. PizzaTopping,
hotAir but not hotDog) and special characters
(e.g. hot dog, hot#Temperature) splits the words of a
concept. If the concept has two words, they are
handled as two different words in the algorithm. All
user-defined names of ontology elements go through a
tokenization process (concepts, individuals and properties)
and a stemming process (properties); both tasks increase
the flexibility of our algorithm.
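The tokenization step can be illustrated with a minimal Python sketch. The function name `tokenize_concept` and the `keep_whole` parameter are hypothetical. The text does not say how a name such as hotDog is kept from being split, so an explicit set of names to leave whole is assumed here.

```python
import re

# Hypothetical helper: splits an ontology concept name into lowercase words.
def tokenize_concept(name, keep_whole=frozenset()):
    # Assumption: names listed in keep_whole (compared in lowercase) are
    # left as a single word, e.g. "hotDog" stays "hotdog".
    if name.lower() in keep_whole:
        return [name.lower()]
    words = []
    # First split on special characters (anything that is not a letter or digit).
    for part in re.split(r"[^A-Za-z0-9]+", name):
        # Then split each part on capitalisation boundaries.
        words.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", part))
    return [w.lower() for w in words]
```

Each resulting word would then be looked up in WordNet separately, as described above.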
In any case, a simple or composed concept is
transformed into a word that has a meaning in WordNet.
Each word in WordNet has a set of senses. Each
sense has a set of synonyms/antonyms,
hyperonyms/hyponyms, meronyms/holonyms, and one
definition (gloss), which adds more terms. Gloss terms
are filtered and stemmed beforehand: filtering avoids
unnecessary analysis (e.g. of articles and prepositions),
and stemming avoids missing coincidences between nouns
(e.g. because of plurals or verb forms).
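As a rough illustration of the gloss preprocessing, the following Python sketch filters stop words and applies a crude suffix-stripping stemmer. The stop-word list, the stemming rules and the names `STOP_WORDS`, `naive_stem` and `gloss_terms` are assumptions made for this example; the paper does not state which filter or stemmer is used.

```python
import re

# Assumption: a tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = frozenset({
    "a", "an", "the", "of", "in", "on", "for", "to",
    "with", "and", "or", "that", "is", "by",
})

def naive_stem(word):
    # Assumption: crude suffix stripping stands in for a real stemmer.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def gloss_terms(gloss):
    # Lowercase, keep alphabetic tokens, drop stop words, then stem.
    tokens = re.findall(r"[a-z]+", gloss.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]
```

With this preprocessing, forms such as "dogs" and "dog" reduce to the same term, so a lexical coincidence between them is not lost.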
3.2 Nomenclature
To ease the reading, a concept from the ontology is
denoted by $C$; each concept has a set of simple words
in WordNet: $C \equiv sW_i$. Each simple word has a
set of senses: $sW_i : S_i$. Hereafter, the indices $i$ are
independent among elements. Each sense has a group of
gloss terms: $sW_i : S_i : t_i$, which is equivalent to
$t_i : S_i : sW_i$. Also, each sense has a set of semantic
WordNet constructors:
$S_i : Hpon_i, Hper_i, Syn_i, Ant_i, Mer_i, Hol_i$.
Thus, ":" can be read as "has", and the index $i$
as "element of the set". Each $Hpon_i$, $Hper_i$, etc. can
be considered as an $sW_k$, which makes the representation
recursive.
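The nomenclature can be mirrored by a small recursive data structure. This is only a sketch: the class and field names are hypothetical, and the sample gloss terms are illustrative rather than taken from WordNet.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Sense:
    # Gloss terms of the sense (t_i).
    terms: List[str] = field(default_factory=list)
    # Semantic constructors; each entry is itself a simple word (sW_k),
    # which is what makes the representation recursive.
    hpon: List["SimpleWord"] = field(default_factory=list)
    hper: List["SimpleWord"] = field(default_factory=list)
    syn: List["SimpleWord"] = field(default_factory=list)
    ant: List["SimpleWord"] = field(default_factory=list)
    mer: List["SimpleWord"] = field(default_factory=list)
    hol: List["SimpleWord"] = field(default_factory=list)

@dataclass
class SimpleWord:
    lemma: str
    senses: List[Sense] = field(default_factory=list)  # sW_i : S_i

@dataclass
class Concept:
    name: str
    simple_words: List[SimpleWord] = field(default_factory=list)  # C = sW_i

# Illustrative data: one sense of "dog" with a hyperonym and two gloss terms.
dog = SimpleWord("dog", [Sense(terms=["domestic", "animal"],
                               hper=[SimpleWord("canine")])])
concept = Concept("dog", [dog])
```

Reading ":" as "has", the path `concept.simple_words[0].senses[0].hper[0]` corresponds to following a concept to a simple word, to one of its senses, and to a hyperonym of that sense.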
3.3 Correspondences
Two elements generate a correspondence when their
lexical forms coincide. We save this correspondence
as both elements together with their semantic
functions. The function of each term is provided by the
semantic function in WordNet (syn., hyper., etc.) and
in the ontology (superclass, subclass, equivalent and
disjoint classes, and properties).
A correspondence exists only when at least one sense
can be obtained from the two concepts. For example,
a match between simple words, $sW_i = sW_k$,
is not useful for our case. Instead, in a term relation
$sW_i : S_i : t_i = t_j : S_j : sW_j$ both senses are
present. The case $sW_i : S_i : Syn_i = sW_j$ is also
valid, since a sense of $sW_i$ is still identified.
Therefore, there are $8 \times 8$ possible combinations
for each pair of simple words.
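The search for correspondences between two simple words can be sketched in Python as follows. The dictionary layout and the function names are hypothetical. The sketch also assumes one reading of the 8 x 8 figure that the text does not spell out: each side contributes eight kinds of element, namely the simple word itself plus the seven kinds attached to a sense (gloss terms and the six constructors of Section 3.2).

```python
from itertools import product

# Kinds of element attached to a sense, following the nomenclature of 3.2.
SENSE_KINDS = ("t", "Hpon", "Hper", "Syn", "Ant", "Mer", "Hol")

def elements(word):
    """Yield (kind, sense_index, lexical_form) for one simple word."""
    # The word itself carries no sense, so its sense index is None.
    yield ("sW", None, word["lemma"])
    for idx, sense in enumerate(word["senses"]):
        for kind in SENSE_KINDS:
            for form in sense.get(kind, ()):
                yield (kind, idx, form)

def correspondences(word_i, word_j):
    """Return lexical matches that identify at least one sense."""
    found = []
    for (k_i, s_i, f_i), (k_j, s_j, f_j) in product(elements(word_i),
                                                    elements(word_j)):
        if f_i != f_j:
            continue  # lexical forms differ: no correspondence
        if s_i is None and s_j is None:
            continue  # plain sW_i = sW_j match: no sense obtained
        found.append((k_i, s_i, k_j, s_j, f_i))
    return found

# Illustrative data only; not taken from WordNet.
dog = {"lemma": "dog",
       "senses": [{"t": ["domestic", "animal"], "Syn": ["hound"],
                   "Hper": ["canine"]}]}
hound = {"lemma": "hound", "senses": [{"t": ["hunting", "animal"]}]}
matches = correspondences(dog, hound)
```

Each returned tuple keeps both elements and their semantic functions, which is the information the voting system needs in order to weight the correspondence.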
3.4 Voting System
Each correspondence is a vote, and each vote has a
different weight according to the semantics of the elements
involved and the ambiguity of their senses. For example,
a concept ($C_i \equiv sW_i$) has a direct or indirect