The format of the smaller glossaries includes the
lemmatised form of the term and the semantic and
morphological feature associated with it:
marque : noun += [!insc:+].
cachet : noun += [!insc:+].
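As an illustration, entries in this format could be read into a simple structure as follows. This is a minimal Python sketch, not the tool used by the authors: the function name parse_entry, the regular expression and the layout of the returned dictionary are assumptions made for the example, and the leading "!" of the feature name is kept verbatim because its meaning is not explained here.

```python
import re

# Hypothetical pattern for entries such as "marque : noun += [!insc:+]."
ENTRY_RE = re.compile(
    r"^\s*(?P<lemma>[^:]+?)\s*:\s*(?P<pos>\w+)\s*\+=\s*"
    r"\[(?P<features>[^\]]*)\]\s*\.\s*$"
)


def parse_entry(line):
    """Split one glossary line into lemma, category and features."""
    match = ENTRY_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognised glossary entry: {line!r}")
    features = {}
    for item in match.group("features").split(","):
        item = item.strip()
        if not item:
            continue
        # "name:value" pairs; the name is stored exactly as written.
        name, _, value = item.partition(":")
        features[name.strip()] = value.strip()
    return {
        "lemma": match.group("lemma"),
        "pos": match.group("pos"),
        "features": features,
    }


if __name__ == "__main__":
    for entry in ("marque : noun += [!insc:+].", "cachet : noun += [!insc:+]."):
        print(parse_entry(entry))
```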
2.2.2 Resolution of Ambiguities
Identifying words or phrases is not the only
difficulty faced by an information extraction
system. Given the context-rich environment of
cultural heritage artefact descriptions, the
complexity of the language itself and the multiplicity
of meanings that the descriptors can take, one of the
major problems is the resolution of semantic
ambiguity. A word or phrase can be used in different
contexts to describe either a characteristic of an
artefact or the artefact itself (for example, a picture
of a chalice); likewise, the name of a person can be
that of a person represented or that of the artist (...).
Heritage objects that are being described are also
often part of a whole. The description of this type of
object can refer to the elements it includes or to its
container, so that several artefact names are
mentioned in the same text. How do we know which
one is the subject of study?
In the sentence: Calice en argent doré, orné de
grappes de raisins, d'épis de blé, de roseaux sur le
pied et la fausse coupe, d'une croix et des
instruments de la passion dans des médaillons, sur
le pied.
("Silver-gilt chalice, decorated with bunches of
grapes, ears of wheat and reeds on the foot and the
false cup, and with a cross and the instruments of the
Passion in medallions on the foot.")
The terms calice, croix, instruments and
médaillons exist in the lexicon DENOMINATION.
The term calice also exists in the lexicon
REPRESENTATION.
How can we be sure that, in this case, it is a
DENOMINATION?
How do we choose the term for the
DENOMINATION?
Study of the initial position
The study of the ordering of descriptors in a text
provides valuable assistance, particularly for solving
certain types of ambiguities. The study of the initial
position, based on cognitive considerations (Enkvist,
1976), (Ho-Dac, 2007), gives special importance to
the beginnings of sentences: the information placed at
the beginning is given information, or at least
information that is important.
From this perspective, extracting information from
the following text:
Calice en argent doré, orné de grappes de
raisins, d'épis de blé, de roseaux sur le pied et la
fausse coupe, d’une croix et des instruments de la
passion dans des médaillons, sur le pied.
will give preference to the descriptor Calice over
the other descriptors mentioned above as the name
of the object studied.
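The initial-position preference can be sketched as a scan that returns the first descriptor of the text found in the DENOMINATION lexicon. This is a minimal Python sketch under stated assumptions: the function name pick_denomination is hypothetical, the lexicon is reduced to the four terms cited above, and surface forms are matched directly, without the lemmatisation a real system would apply.

```python
import re

# Toy DENOMINATION lexicon, limited to the terms cited in the example.
DENOMINATION = {"calice", "croix", "instruments", "médaillons"}


def pick_denomination(text, lexicon=DENOMINATION):
    """Return the earliest word of the text that belongs to the lexicon."""
    for match in re.finditer(r"\w+", text.lower()):
        if match.group() in lexicon:
            return match.group()
    return None  # no known descriptor in the text


if __name__ == "__main__":
    sentence = (
        "Calice en argent doré, orné de grappes de raisins, d'épis de blé, "
        "de roseaux sur le pied et la fausse coupe, d'une croix et des "
        "instruments de la passion dans des médaillons, sur le pied."
    )
    print(pick_denomination(sentence))
```

On the example sentence the scan stops at calice, before reaching croix, instruments or médaillons. Position alone is only a preference, so a fuller system would presumably combine it with other clues such as the local context.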
Local context
Resolving ambiguities requires an analysis and
understanding of the local context. A morphosyntactic
analysis of the words surrounding the word whose
meaning we seek to identify, together with a search for
linguistic clues in the context of a theme, can resolve
some ambiguities.
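In the sentence: C’est une peinture à l’huile de
très grande qualité, panneau sur bois représentant
deux figures à mi corps sur fond de paysage, Saint
Guilhem et Sainte Apolline, peintures enchâssées
sous des architectures à décor polylobés; Saint
Guilhem est représenté en abbé bénédictin (alors
qu’à sa mort en 812 il n’était que simple moine);
Sainte Apolline tient l’instrument de son martyre,
une longue tenaille.
("It is an oil painting of very high quality, a panel
on wood depicting two half-length figures against a
landscape background, Saint Guilhem and Sainte
Apolline, paintings set beneath architectural frames
with polylobed decoration; Saint Guilhem is depicted
as a Benedictine abbot (although at his death in 812
he was only a simple monk); Sainte Apolline holds the
instrument of her martyrdom, a long pair of pincers.")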
Saint Guilhem can designate a place or a person.
Is it a painting that is located in Saint Guilhem, or
does it represent Saint Guilhem and Sainte Apolline?
A study of the position and the semantic class of
the arguments in the relationship
subject-verb-object
provides clues for resolving this ambiguity. It relies
on the principle that the topic (theme) is the subject
of the sentence, that is, what is being talked about,
while the comment is what is said about the theme.
In the above example, the verb représentant
(representing) carries the feature [Repr: +], which
links it with the REPRESENTATION class. In the
absence of other significant clues, it can thus be
inferred that the sentence is about a “representation”,
and that Saint Guilhem and Sainte Apolline do not
designate places but rather what is represented.
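The inference above can be approximated by checking whether a verb carrying the feature Repr:+ occurs in the context preceding the ambiguous name. The following Python sketch is a deliberately crude stand-in: it uses a substring test on the left context instead of a true subject-verb-object analysis, and the names VERB_FEATURES and classify_name, as well as the "UNRESOLVED" label, are assumptions introduced for the example.

```python
# Hypothetical verb lexicon; only the Repr feature is taken from the text.
VERB_FEATURES = {"représentant": {"Repr": "+"}}


def classify_name(sentence, name):
    """Guess the class of an ambiguous name from the verbs that precede it."""
    position = sentence.find(name)
    if position == -1:
        return None  # the name does not occur in the sentence
    left_context = sentence[:position].lower()
    for verb, features in VERB_FEATURES.items():
        # A preceding verb with Repr:+ points to the REPRESENTATION class.
        if features.get("Repr") == "+" and verb in left_context:
            return "REPRESENTATION"
    # No clue found: the ambiguity (place or person) is left open.
    return "UNRESOLVED"


if __name__ == "__main__":
    sentence = (
        "panneau sur bois représentant deux figures à mi corps sur fond "
        "de paysage, Saint Guilhem et Sainte Apolline"
    )
    print(classify_name(sentence, "Saint Guilhem"))
```

A sentence with no such verb before the name, for instance one that only states where a painting is kept, would be left as "UNRESOLVED" by this sketch and would need other clues.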
2.3 Semi-automatic Generation of a Domain Ontology
The knowledge gathered on an artefact is necessarily
partial: it is only valid for a period of time and
therefore cannot be limited to a descriptive grid
designed for one specific application.
Knowledge evolves: cultural heritage artefacts
have a past, a present and perhaps a future; they
undergo transformations over time.
However, we have seen above that the extraction
of information in our case must correspond to
precise specifications. We are thus faced with two
requirements: on the one hand to populate a database
defined by a specific inventory description system,
KMIS 2010 - International Conference on Knowledge Management and Information Sharing