implement widely used techniques for IE. More specifically, we considered the combination of T-Rex, a machine-learning framework, and Saxon, a rule-based extractor, to address issues of computational cost as well as precision and recall when extracting
information from such short snippets of text. As the evaluation results show, a hybrid approach to extracting information from image descriptions is promising; however, precision and recall could be improved by using external knowledge to reinforce the extractions. For instance, an ambiguous description such as “Highland near Ben Nevis” could be placed in the context of the user (e.g. does s/he know anyone called “Ben Nevis”?), the image itself (e.g. GPS positioning), or other image descriptions within the same collection (e.g. “Ben Nevis” was previously classified as a location/person). Another possible refinement of the approach, one that has previously been applied with success to the task of image annotation [1], is to involve the user in the process to reinforce system decisions, such as confirming the outcome of a conflict resolution.
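The contextual reinforcement described above could be sketched as follows. This is an illustration only, not the system implemented in this paper: the function name and its three context sources (user contacts, nearby geotagged places, prior classifications within the collection) are assumptions.

```python
# Illustrative sketch only: disambiguating an extracted name such as
# "Ben Nevis" using external context. The three context sources are
# hypothetical stand-ins for the evidence discussed in the text.

def resolve_entity(name, user_contacts, nearby_places, prior_labels):
    """Return 'person', 'location', or None for an ambiguous name."""
    votes = []
    if name in user_contacts:      # does the user know anyone by this name?
        votes.append("person")
    if name in nearby_places:      # is the photo geotagged near such a place?
        votes.append("location")
    if name in prior_labels:       # earlier classification in this collection
        votes.append(prior_labels[name])
    if not votes:
        return None                # no contextual evidence; leave unresolved
    return max(set(votes), key=votes.count)  # simple majority vote

print(resolve_entity("Ben Nevis", set(), {"Ben Nevis"}, {}))
# prints: location
```

A real system would weight these evidence sources rather than vote equally, but the sketch captures the idea of deferring ambiguous extractions to context.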
Furthermore, the concepts used here form an incomplete list of those useful within an image description. One important area for future work is the extraction of further concepts used by people to describe their images (e.g. time, events, mood). Also, some extraction examples, such as the description “Vicky and dad at local bus stop”, where “local bus stop” is extracted as an object, suggest that certain concepts may need further refinement. In this case, refinement would allow the extracted object instance to also be assigned geographic properties, given the contextual information about the image.
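Such a refinement might be sketched as below; the class and field names are assumptions for illustration, not the schema used by the system.

```python
# Hypothetical refinement of the extraction schema: an extracted object
# instance that can also carry geographic properties derived from the
# image's context (e.g. GPS metadata), as suggested above.

from dataclasses import dataclass, field

@dataclass
class ExtractedInstance:
    text: str                                # surface form in the description
    concept: str                             # concept assigned by the extractor
    geo: dict = field(default_factory=dict)  # optional geographic properties

bus_stop = ExtractedInstance("local bus stop", "object")
# enrich the object instance with contextual geographic information
bus_stop.geo["context"] = "near image GPS position"  # illustrative value
print(bus_stop.concept, bus_stop.geo)
```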
Acknowledgements
This work was sponsored by Kodak Eastman Corporation. We would also like to thank the 391 online photo-sharing users who donated their photos and captions.
References
1. L. von Ahn and L. Dabbish. Labeling images with a computer game. In Proc. of CHI 2004, pages 319–326, New York, NY, USA, 2004. ACM Press.
2. A. Barla, F. Odone, and A. Verri. Old fashioned state-of-the-art image classification. In Proc.
of ICIAP 2003, pages 566–571, Sept 2003.
3. M. Hearst. Automatic acquisition of hyponyms from large text corpora. In Proc. of COLING
1992, pages 539–545, 1992.
4. J. Iria and F. Ciravegna. A Methodology and Tool for Representing Language Resources for
Information Extraction. In Proc. of LREC 2006, Genoa, Italy, May 2006.
5. J. Iria, N. Ireson, and F. Ciravegna. An experimental study on boundary classification algorithms for information extraction using SVM. In Proc. of EACL 2006, April 2006.
6. M. Naaman, S. Harada, Q. Wang, H. Garcia-Molina, and A. Paepcke. Context data in geo-
referenced digital photo collections. In Proc. of ACM MM, Oct 2004.
7. K. Pastra, H. Saggion, and Y. Wilks. Extracting relational facts for indexing and retrieval of crime-scene photographs, 2002.
8. R. Srihari. Automatic indexing and content-based retrieval of captioned images. Computer,
28(9):49–56, 1995.
9. R. Veltkamp and M. Tanase. Content-based image retrieval systems: A survey. Technical
Report UU-CS-2000-34, Dept. of Computing Science, Utrecht University, 2000.