native speakers will be of higher quality (quality is
defined here in terms of quantity and variety of tags
used, once errors have been eliminated); (ii) The
quality of those tags created by native speakers will
become more apparent when they have to describe
feelings evoked by the picture rather than when they
objectively describe what is seen in the picture.
If these assumptions are valid, we will have
objective data to design a recommendation system in
which tags would be automatically proposed to users
based on previous tagging sessions. Only sessions
from users providing high-quality tags (i.e. good
tags in terms of quantity and variety) would be
selected. This recommendation
system would help non-native taggers to work with
tags used by native taggers.
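
The selection step described above can be illustrated with a minimal sketch. It assumes that quality is measured per session as the number of tags (quantity) and the number of distinct tags (variety), and that a session qualifies when its variety reaches a threshold. The names `session_quality` and `recommend_tags`, the `min_variety` threshold and the session layout are hypothetical and are not taken from the system described in this paper.

```python
from collections import Counter


def normalize(tags):
    """Lower-case and trim tags, dropping empty entries."""
    return [t.strip().lower() for t in tags if t.strip()]


def session_quality(tags):
    """Return (quantity, variety) for the tags of one session.

    Assumption: variety is the number of distinct tags after normalization.
    """
    cleaned = normalize(tags)
    return len(cleaned), len(set(cleaned))


def recommend_tags(sessions, image_id, min_variety=4, top_n=5):
    """Propose the most frequent tags for an image.

    Only sessions whose variety reaches min_variety (a hypothetical
    threshold) contribute. Each session counts a tag at most once.
    """
    counts = Counter()
    for session in sessions:
        if session["image_id"] != image_id:
            continue
        _, variety = session_quality(session["tags"])
        if variety < min_variety:
            continue
        counts.update(set(normalize(session["tags"])))
    # Sort by descending frequency, then alphabetically, so ties are stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:top_n]]
```

In such a design the threshold controls how strict the filter on previous sessions is; a real system would need to calibrate it on collected data.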
Goals and main contributions: to the best of our
knowledge, few if any investigators are working on
support for non-native taggers of images, or on
distinguishing and supporting subjective versus
objective tagging. These are the two main lines of
the work presented in this paper.
2 STATE OF THE ART AND RELATED WORK
We ask to what extent users with different
language skill levels vary in the way they index
contents which are similar or the same. Specifically,
we will look at the description of images, and the
difference between tags which represent feelings,
emotions or sensations compared with tags which
represent objective descriptions of the images
(Boehner, DePaula, Dourish, Sengers, 2007)
(Isbister, Hook, 2007).
In recent years tag recommendation has become
a popular area of applied research, and of
commercial interest for the major search engine and
content providers (Yahoo, Google, Microsoft,
AOL…). Different approaches to tag recommendation
have been proposed, such as those based on
collective knowledge (Sigurbjörnsson, van Zwol,
2008), approaches based on analysis of the images
themselves (when the tags refer to images)
(Anderson, Raghunathan, Vogel, 2008),
collaborative approaches (Lee, 2007), a classic IR
approach by analyzing folksonomies (Lipczak,
Angelova, Milios, 2008), and systems based on
personalization (Garg, Weber, 2008). With respect
to considerations of non-native users, we can cite
works such as (Sood, Hammond, Owsley,
Birnbaum, 2007). Finally we can cite approaches
based on complex statistical models, such as (Song,
2008).
3 METHODOLOGY – DESIGN OF EXPERIMENTS FOR USER EVALUATION
For this study we have selected 10 photographs from
Flickr. The photographs were chosen for their
contrasting content and for their
potential to require different tags for ‘see’ and
‘evoke’. Image 1 is of a person with his hands to his
face; Image 2 is of a man and a woman caressing;
Image 3 is of a small spider in the middle of a web;
Image 4 is of a group of people dancing in a circle
with a sunset in the background; Image 5 is of a lady
holding a baby in her arms; Image 6 is of a boy
holding a gun; Image 7 is of an old tree in the
desert, bent over by the wind; Image 8 is of a hand
holding a knife; Image 9 is a photo taken from above
of a large cage with a person lying on its floor;
finally, Image 10 is of a small bench on a horizon.
We have created a web site with a questionnaire
in which the user enters his/her demographic
data and tags for the photographs (tag session), and
then answers some questions after
completing the session. Tag sessions have been
captured for native and non-native
English speakers, and our website reference is:
http://www.tradumatica.net/bmesa/interact2007/index_en.htm.
Tag Session Capture. During a tag session the users
must assign between 4 and 10 tags which are related
to the objects which they can see in the image and a
similar number of tags related to what each image
evokes for them, in terms of sensations or emotions.
With reference to Figure 1, in the first column the
user writes the tags which express what they see in
the image, while in the second column the user
writes the tags which describe what the image
evokes. We have currently accumulated a total of
162 user tag sessions from 2 different countries,
in which the users describe the photographs
in English. For approximately half of the users,
English is their native language and for the other
half it is a second language.
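
As an illustration of the capture format described above, the following minimal sketch models the two tag columns for one image and checks the number of tags entered. The class name `ImageTags`, its fields and the message format are hypothetical; applying the same bounds of 4 to 10 tags to the 'evoke' column is an assumption, since the text only asks for "a similar number" there.

```python
from dataclasses import dataclass
from typing import List

MIN_TAGS, MAX_TAGS = 4, 10  # bounds stated for the 'see' column


@dataclass
class ImageTags:
    """Tags entered by one user for one image (hypothetical structure)."""
    image_id: int
    see: List[str]    # first column: what the user sees in the image
    evoke: List[str]  # second column: what the image evokes

    def problems(self):
        """Return a list of messages for columns outside the tag bounds."""
        issues = []
        for column, tags in (("see", self.see), ("evoke", self.evoke)):
            n = len([t for t in tags if t.strip()])  # ignore blank entries
            if not MIN_TAGS <= n <= MAX_TAGS:
                issues.append(
                    f"image {self.image_id}: '{column}' has {n} tags, "
                    f"expected {MIN_TAGS} to {MAX_TAGS}"
                )
        return issues
```

A check of this kind could run when the questionnaire is submitted, so that incomplete sessions are flagged before they enter the data set.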
Raw Data and Derived Factors. From the tags
collected and the information which the users have
provided, we can compare results in the English
language used by natives and non-natives in that
KDIR 2009 - International Conference on Knowledge Discovery and Information Retrieval