We need spatial relations and composition
relations of a visual nature, which are a direct
interpretation of what we see.
Composition relations are represented by a single
relation, INCLUDES. The surface of a photograph is
divided into areas formed by the boundaries of the
depicted objects. The area of one object can enclose
the area of another, e.g., (SKY ((FLYING-BIRD
(WINGS)) SUN)). Objects that lie inside the area of
another object are “included” in it. For relations
between partly included objects, e.g., OBJECTS
standing on GROUND, we use our knowledge about what
can “stand on” and what can be a “localization”.
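The bracket notation above can be read mechanically. The following is a minimal sketch, assuming that each object is written either as a bare name or as (NAME (enclosed objects ...)); this grammar is our reading of the single example in the text, and the function names are hypothetical.

```python
import re

def tokenize(text):
    # Split into parentheses and object names such as FLYING-BIRD.
    return re.findall(r"[()]|[^\s()]+", text)

def parse_node(tokens, pos=0):
    """Parse NAME or (NAME (child ...)); return ((name, children), next_pos)."""
    if tokens[pos] != "(":
        return (tokens[pos], []), pos + 1
    name = tokens[pos + 1]
    pos += 2
    children = []
    if tokens[pos] == "(":              # optional list of enclosed objects
        pos += 1
        while tokens[pos] != ")":
            child, pos = parse_node(tokens, pos)
            children.append(child)
        pos += 1                        # skip the ")" closing the child list
    if tokens[pos] != ")":
        raise ValueError("expected ')' after object " + name)
    return (name, children), pos + 1

def includes_pairs(node):
    """List every direct INCLUDES(outer, inner) pair in the tree."""
    name, children = node
    pairs = []
    for child in children:
        pairs.append((name, child[0]))
        pairs.extend(includes_pairs(child))
    return pairs

tree, _ = parse_node(tokenize("(SKY ((FLYING-BIRD (WINGS)) SUN))"))
PAIRS = includes_pairs(tree)
```

For the example from the text, PAIRS holds (SKY, FLYING-BIRD), (FLYING-BIRD, WINGS) and (SKY, SUN), i.e. only direct inclusions; indirect ones such as SKY including WINGS would follow by transitivity.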
When describing a picture in NL, we often divide it
into “layers”: groups of objects at an equal
distance from the viewer, e.g., “in the foreground we
see a group of people, in the background a street”.
The concept LAYER represents these groups of objects.
In practice people use from 0 to 2 layers
(“foreground” and “background”), but in some cases a
picture description can contain more layers.
Objects whose areas do not intersect, or that
intersect in some way other than “an object standing
on the GROUND”, are related by spatial relations,
which are usually bidirectional, e.g.:
To-the-left(X, Y) / To-the-right(Y, X)
Near(X, Y) / Near(Y, X)
Around(Y, X) / In-the-centre(X, Y)
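Bidirectionality means that each stated relation implies its counterpart with the arguments swapped. A minimal sketch of this idea follows; the table and function names are hypothetical, and treating Around / In-the-centre as an argument-swapping pair like the other two is our assumption.

```python
# Each relation name maps to the name of its counterpart.
# Near is symmetric, so it is its own counterpart.
INVERSE = {
    "To-the-left": "To-the-right",
    "To-the-right": "To-the-left",
    "Near": "Near",
    "Around": "In-the-centre",      # assumption: arguments swap as above
    "In-the-centre": "Around",
}

def inverse(relation, x, y):
    """Return the counterpart fact of relation(x, y)."""
    return (INVERSE[relation], y, x)

def close_under_inverse(facts):
    """Add the counterpart of every fact, so either direction can be verbalized."""
    closed = set(facts)
    for relation, x, y in facts:
        closed.add(inverse(relation, x, y))
    return closed

FACTS = close_under_inverse({("To-the-left", "TREE", "HOUSE")})
```

With both directions available, a text generator is free to pick whichever object is more convenient as the sentence subject.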
We consider an IM to be a kind of visual specification
with elements of semantic interpretation. Both can
vary in their level of detail, from a general to a
more detailed description.
4 ONTOLOGY
We define classes of objects designated by English
words, e.g., HOUSE, TREE, SMOKE-STACK,
FIELD. In IMs we use instances of classes, their
visual parameters and their values, e.g., COLOR:
BLACK, BLUE, etc.; SIZE: SMALLinWIDTH,
BIG, etc.; SHAPE: SQUARE, ROUND, etc.
We also need hypernyms for the classes. They
can be used in situations of visual haziness or for
a second naming of the same entity in the text,
e.g., for the classes EDIFICE, CHAPEL, BARN and
CABIN we need a superclass BUILDING, which is
related to them by the “is-a” relation. Hypernyms can
be extracted from existing ontologies, e.g.,
WordNet. But the concept descriptions in WordNet are
not suitable for our purposes, since they are mostly
functional, e.g.:
DOOR is-a “movable barrier” (a barrier that can
be moved to allow passage).
We can also use the ontological relations “part
holonym – part meronym”. For the classes WALL,
ROOF, WINDOW and DOOR, the part-holonym classes are
BUILDING and EDIFICE. For BUILDING, its part
meronyms include WALL, ROOF and PORCH. So we
need Classes and Superclasses that correspond to
the “is-a” relation, one Composition relation “part”
that has two terminals, part-holonym and
part-meronym, and a number of Spatial relations.
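The “is-a” and “part” relations described above can be pictured as two small tables. This is a minimal sketch using only the classes named in the text; the table layout and the function names are hypothetical, not the structure of the actual ontology.

```python
# "is-a": each class maps to its superclass (hypernym).
IS_A = {
    "EDIFICE": "BUILDING",
    "CHAPEL": "BUILDING",
    "BARN": "BUILDING",
    "CABIN": "BUILDING",
}

# "part": pairs of (part-meronym, part-holonym), taken from the examples.
PART_OF = {("PORCH", "BUILDING")}
for part in ("WALL", "ROOF", "WINDOW", "DOOR"):
    PART_OF.add((part, "BUILDING"))
    PART_OF.add((part, "EDIFICE"))

def hypernyms(cls):
    """Follow "is-a" links upward, nearest superclass first."""
    chain = []
    while cls in IS_A:
        cls = IS_A[cls]
        chain.append(cls)
    return chain

def meronyms(whole):
    """Classes that are parts of the given class."""
    return sorted(p for p, w in PART_OF if w == whole)

def holonyms(part):
    """Classes that the given class can be a part of."""
    return sorted(w for p, w in PART_OF if p == part)
```

For instance, hypernyms("CHAPEL") yields the fallback name BUILDING for a hazy object, and holonyms("WALL") lists the wholes a wall may belong to.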
The resulting ontology should be reasonably easy
to use for composing IMs. So it is not a good idea
to allow, e.g., a visual parameter of an object to be
chosen from the whole set of visual parameters
of all classes, or a relation to be chosen from the
whole set of relations between any classes in the
ontology. Thus we need a kind of filter that
ensures that each object is supplied with a
proper set of visual parameters and relations.
This filter presents the description of the subject
matter as information prepared for communication,
where every object is presented in some cognitive
perspective (CP). CPs are containers of visual
parameters and of participants in relations. A Class
consists of one or several CPs, e.g., the class RIVER
consists of the SURFACE and MIRROR CPs. An instance
of a Class can be manually assigned an extra CP in
the IM if it plays an untypical role in the picture,
e.g., if a SCARF is used as a SKIRT, we need to
combine both of these CPs in the IM.
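The filtering role of CPs can be sketched as a lookup that restricts the choices offered for an object. The example below is illustrative only: the parameters and relations placed in each CP are invented for the illustration, and all names apart from RIVER, SURFACE, MIRROR, SCARF and SKIRT are hypothetical.

```python
# Each CP is a container of admissible visual parameters and relations.
# The contents below are assumptions made for this sketch.
CPS = {
    "SURFACE": {"parameters": {"COLOR", "SIZE"}, "relations": {"INCLUDES", "Near"}},
    "MIRROR": {"parameters": {"COLOR"}, "relations": {"INCLUDES"}},
    "SCARF": {"parameters": {"COLOR"}, "relations": {"Around"}},
    "SKIRT": {"parameters": {"COLOR", "SHAPE"}, "relations": {"Near"}},
}

# A class consists of one or several CPs.
CLASS_CPS = {
    "RIVER": ["SURFACE", "MIRROR"],
    "SCARF": ["SCARF"],
    "SKIRT": ["SKIRT"],
}

def allowed(cls, kind, extra_cps=()):
    """Union of the 'parameters' or 'relations' over the class's CPs,
    plus any CPs assigned manually to this particular instance."""
    result = set()
    for cp_name in list(CLASS_CPS[cls]) + list(extra_cps):
        result |= CPS[cp_name][kind]
    return result

# A scarf worn as a skirt: the instance combines both CPs.
SCARF_AS_SKIRT = allowed("SCARF", "parameters", extra_cps=["SKIRT"])
```

An editor built on such a lookup would offer only the returned choices when a user describes an object, instead of every parameter and relation in the ontology.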
The ontology can be used in two paradigms:
image recognition and NLG of picture descriptions.
Here we focus only on the NLG paradigm. The
recognition process would also need some reasoning.
5 EXPERIMENT
The ontology ought to be implemented in OWL (Web
Ontology Language), because it is based on XML,
which is convenient for processing, and is
specially designed for describing ontologies.
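To illustrate why an XML-based format is convenient to process, the sketch below writes a few “is-a” links as OWL classes in RDF/XML and reads them back with a general-purpose XML library. It is not the ontology built in the experiment: the base IRI and the function names are hypothetical, and only the owl:Class and rdfs:subClassOf constructs are used.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/picture-ontology#"   # hypothetical base IRI

for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)

def build_owl(is_a):
    """Serialize {subclass: superclass} links as owl:Class elements."""
    root = ET.Element(f"{{{RDF}}}RDF")
    for name in sorted(set(is_a) | set(is_a.values())):
        cls = ET.SubElement(root, f"{{{OWL}}}Class",
                            {f"{{{RDF}}}about": BASE + name})
        if name in is_a:
            ET.SubElement(cls, f"{{{RDFS}}}subClassOf",
                          {f"{{{RDF}}}resource": BASE + is_a[name]})
    return ET.tostring(root, encoding="unicode")

def read_is_a(xml_text):
    """Recover the {subclass: superclass} links from the XML text."""
    links = {}
    for cls in ET.fromstring(xml_text).findall(f"{{{OWL}}}Class"):
        parent = cls.find(f"{{{RDFS}}}subClassOf")
        if parent is not None:
            child_name = cls.get(f"{{{RDF}}}about").split("#")[1]
            links[child_name] = parent.get(f"{{{RDF}}}resource").split("#")[1]
    return links

LINKS = {"CHAPEL": "BUILDING", "BARN": "BUILDING"}
OWL_TEXT = build_owl(LINKS)
```

Reading OWL_TEXT back with read_is_a returns the original links, which shows that the hierarchy survives a round trip through plain XML tooling.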
The standard solution, managing OWL descriptions
with Altova SemanticWorks, failed because the
system is not sufficiently optimized for the
specifics of our task. Another tool, SemTalk2, has a
user-friendly editor for Semantic Web ontologies.
SemTalk2 is based on Visio diagrams, which are used
to introduce new objects into the ontology or an IM.
It supports two types of diagrams: Class Diagrams,
for describing the ontology, and Instance Diagrams,
for describing IMs. Maintenance of the ontology is
supported not only by diagrams but also by a
hierarchical view, and importing external models is
allowed.
KEOD 2009 - International Conference on Knowledge Engineering and Ontology Development