pute of the relevance score of a multimedia element must
be based on the textual and structural information of
neighboring XML nodes (Hliaoutakis et al., 2006).
Several works treat an XML document as a flat
source of information and ignore its structure.
In this context, (Schlieder and Holger, 2002) state:
"To ignore the document structure is to ignore its
semantics". Indeed, an XML document describes a set
of data through a structure that provides a semantic
lexicon, which facilitates the interpretation and
exploitation of the information. In response to this
need, new works have appeared in the field of
multimedia retrieval that take the structure into
account as a source of relevant information.
Existing work on the structured retrieval of multimedia
elements falls into two classes. The first class
adapts traditional information retrieval techniques,
such as language models. In this context, the
CWI/UTwente team filters the results to keep only
the fragments containing at least one multimedia
element (Westerveld et al., 2007) (Tsikrika et al.,
2008).
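The filtering step described above can be illustrated with a minimal
sketch. It assumes that result fragments are XML elements and that
multimedia elements are recognized by a hypothetical set of tag names
(here `image` and `figure`); the tag names and the function name are
illustrative assumptions and are not taken from the cited systems.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names marking multimedia elements (assumption for illustration).
MULTIMEDIA_TAGS = {"image", "figure"}

def keep_multimedia_fragments(fragments):
    """Keep only fragments that are, or contain, at least one multimedia element."""
    kept = []
    for fragment in fragments:
        # Element.iter() yields the fragment itself and all of its descendants.
        if any(node.tag in MULTIMEDIA_TAGS for node in fragment.iter()):
            kept.append(fragment)
    return kept

doc = ET.fromstring(
    "<article>"
    "<sec id='s1'><p>Text only.</p></sec>"
    "<sec id='s2'><p>Caption.</p><image src='a.png'/></sec>"
    "</article>"
)
ranked = list(doc.findall("sec"))  # stand-in for a ranked result list
filtered_ids = [f.get("id") for f in keep_multimedia_fragments(ranked)]
```

In this toy document only the second section survives the filter, since it is the only fragment with an `image` descendant.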
The second class includes works specific to
structured multimedia retrieval. This class uses the
structure as a source of evidence when selecting
multimedia elements. First, (Kong and Lalmas, 2005)
proposed a method that combines the structure of the
XML document (XPath) with the use of links (XLink).
This method divides the XML document into regions,
each representing an area of ancestors of the
multimedia element. The score of the multimedia
element is computed as a function of the scores of
the regions. This method exploits the vertical
structure only. Later, (Torjmen et al., 2010) added
the horizontal structure to the notion of hierarchy.
They use a method called "CBA" (Children, Brothers,
Ancestors), which takes into account the information
carried by the children, brother and ancestor nodes
to compute the relevance of multimedia elements. The
authors also propose an alternative method,
"OntologyLike", which treats the XML document as an
ontology. To compute the similarity between nodes,
the authors use similarity measures (Rada et al.,
1989) (Hirst and St-Onge, 1997) (Wu and Palmer, 1994)
that are mainly based on the number of edges
separating the nodes.
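To make the idea of edge-based measures concrete, the following is a
minimal sketch of two of them on a tree: a path-length distance in the
spirit of Rada et al. and a depth-based similarity in the commonly used
Wu and Palmer form, 2 * depth(lca) / (depth(a) + depth(b)). The tree
encoding (a child-to-parent dictionary), the node names and the
convention that the root has depth 1 in the similarity are assumptions
made for illustration; the measure of Hirst and St-Onge is not covered.

```python
def path_to_root(node, parent):
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def lowest_common_ancestor(a, b, parent):
    """Return the deepest node that is an ancestor-or-self of both a and b."""
    ancestors_of_a = set(path_to_root(a, parent))
    for node in path_to_root(b, parent):
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes are not in the same tree")

def depth(node, parent):
    """Number of edges between `node` and the root."""
    return len(path_to_root(node, parent)) - 1

def edge_distance(a, b, parent):
    """Path-length distance: number of edges on the path between a and b."""
    lca = lowest_common_ancestor(a, b, parent)
    return depth(a, parent) + depth(b, parent) - 2 * depth(lca, parent)

def wu_palmer_similarity(a, b, parent):
    """Depth-based similarity, counting depths in nodes (root has depth 1)."""
    lca = lowest_common_ancestor(a, b, parent)
    node_depth = lambda n: depth(n, parent) + 1
    return 2 * node_depth(lca) / (node_depth(a) + node_depth(b))

# Toy XML-like tree: article -> {title, sec}, sec -> {p, image}.
parent = {"title": "article", "sec": "article", "p": "sec", "image": "sec"}
sibling_distance = edge_distance("p", "image", parent)
title_distance = edge_distance("title", "image", parent)
```

Under these conventions the paragraph that shares a section with the image is closer to it (two edges) than the article title (three edges).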
Other approaches to multimedia retrieval are based
on exploiting the links in XML documents
(Awadi and Torjmen, 2010). This work was improved
by a hybrid approach that combines the structure
with the use of links, which are considered as
semantic links (Aouadi et al., 2012). This method
divides the document into regions according to the
hierarchical structure and the location of the image
in the document. This location plays a role in the
weighting of links used to compute the score of the
image.
In this paper, we propose a new metric for
multimedia retrieval in XML documents, which uses
geometric distances to calculate the relevance of
each node with respect to the multimedia node. The
method consists of placing the nodes of the XML
document in a Euclidean space, defining each node by
a vector of coordinates, and then calculating the
distance between each pair of nodes. This distance
is then used to compute the score of the multimedia
element.
3 PROPOSED APPROACH
The structure of an XML document, which is composed
of a root and a set of nodes with elements and
attributes, influences the relevance of an XML
fragment. The structure also serves to identify and
describe the various textual and non-textual
components of a structured document. The retrieval
of relevant elements can then be based on these
components rather than on the element itself. In
this direction, we focus on proximity, kinship and
nesting relations by defining a set of geometric
distances that best represent multimedia elements
according to their vicinity. The degree to which
each text node contributes to the relevance of the
multimedia element depends mainly on the distance
between that node and the multimedia element.
In this paper, we present a new "geometric" source
of evidence dedicated to multimedia retrieval. It is
based on the intuition that each textual node
contains information that semantically describes a
multimedia element, and that the contribution of
each text node to the score of a multimedia element
varies with its position in the XML document.
To compute the geometric distance, we first place
the nodes of each XML document in a Euclidean space
and calculate the coordinates of each node using
Algorithm 1, defined below. Then, we compute the
score of a multimedia element as a function of its
distance to each textual node.
To represent structural information, we analyzed
the structure of XML documents and their tree
representation, and we chose a new geometric metric
to represent the elements of an XML document. Each
node is placed in a Euclidean space, and the
distance is computed between the multimedia element
and each textual node.
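The general idea of distance-weighted scoring can be sketched as
follows. This is only an illustration under stated assumptions and not
the coordinate algorithm of the paper: each node is given hypothetical
coordinates (x = pre-order rank, y = depth), the weight of a text node
is the number of query terms it contains, and each weight is damped by
1 / (1 + distance). The function names `assign_coordinates`,
`term_matches` and `multimedia_score` are likewise illustrative.

```python
import math
import xml.etree.ElementTree as ET

def assign_coordinates(root):
    """Map each element to hypothetical (x, y) coordinates.

    Assumption (not the paper's Algorithm 1): x is the pre-order rank of
    the node in the document and y is its depth in the tree.
    """
    coords = {}
    counter = 0
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        coords[node] = (float(counter), float(depth))
        counter += 1
        # Push children in reverse so they are visited in document order.
        for child in reversed(list(node)):
            stack.append((child, depth + 1))
    return coords

def term_matches(text, query_terms):
    """Count occurrences of the query terms in a node's text (assumed weighting)."""
    words = (text or "").lower().split()
    return sum(words.count(term) for term in query_terms)

def multimedia_score(root, media, query_terms):
    """Sum text-node weights, each damped by its Euclidean distance to `media`."""
    coords = assign_coordinates(root)
    score = 0.0
    for node in root.iter():
        if node is media:
            continue
        weight = term_matches(node.text, query_terms)
        if weight:
            distance = math.dist(coords[node], coords[media])
            score += weight / (1.0 + distance)  # assumed damping function
    return score

doc = ET.fromstring(
    "<article><title>Eiffel tower</title>"
    "<sec><p>The tower at night</p><image src='t.jpg'/></sec>"
    "<sec><p>Unrelated tower facts</p></sec></article>"
)
image = doc.find(".//image")
score = multimedia_score(doc, image, ["tower"])
```

With these assumed coordinates, the paragraph next to the image contributes more to its score than the title or the paragraph in the other section, which matches the intuition that nearby text describes the multimedia element best.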