Human-centered Region Selection and Weighting for Image Retrieval

Jean Martinet

Abstract

We present an application of gaze tracking to image and video indexing, in the form of a model for selecting and weighting Regions of Interest (RoIs). Image and video indexing is the process of creating a synthetic representation of the media, for instance for retrieval purposes. It usually consists of labeling the media with semantic keywords that describe its content. When automated, this process relies on the analysis of visual features, which can be extracted either globally from the whole image or keyframe, or locally from regions. Most of the time, the whole image is not relevant for indexing: large flat regions with no specific semantic interpretation, blurred regions, and background regions may not be relevant for retrieval purposes and should be filtered out. It is therefore preferable to concentrate the labeling process on specific RoIs that are considered representative of the scene, such as the main subjects. The objective of the work presented here is to use natural human gaze information to define a human-centered RoI selection and weighting technique in the context of media retrieval.
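The abstract does not spell out the model itself, so the following is only a minimal sketch of what gaze-based RoI selection and weighting could look like, not the paper's method. It assumes regions are axis-aligned boxes, fixations are `(x, y, duration_ms)` tuples, and a region's weight is its share of total fixation time; the names `Region`, `weight_regions`, and `min_weight` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical axis-aligned region: [x0, x1) by [y0, y1) in pixels."""
    name: str
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x: float, y: float) -> bool:
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1


def weight_regions(regions, fixations, min_weight=0.1):
    """Weight each region by its share of total fixation time.

    Assumption (not from the paper): dwell time is a proxy for interest.
    Regions whose weight falls below `min_weight` are dropped (selection);
    the remaining weights are returned as-is (weighting).
    """
    dwell = {r.name: 0.0 for r in regions}
    for x, y, duration in fixations:
        for r in regions:
            if r.contains(x, y):
                dwell[r.name] += duration
                break  # each fixation counts for the first matching region only
    total = sum(dwell.values())
    if total == 0:
        return {}  # no fixation landed on any region
    weights = {name: t / total for name, t in dwell.items()}
    return {name: w for name, w in weights.items() if w >= min_weight}


regions = [
    Region("subject", 0, 0, 100, 100),
    Region("background", 100, 0, 200, 100),
]
# Made-up fixations: two on the subject, one on the background, one off-image.
fixations = [(10, 10, 300), (50, 50, 500), (150, 50, 50), (500, 500, 100)]
selected = weight_regions(regions, fixations)
```

In this toy input the background region receives too little fixation time to pass the threshold, so only the subject region is kept for labeling. A real system would likely need to handle overlapping regions and gaze-tracker noise, which this sketch ignores.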


Paper Citation


in Harvard Style

Martinet J. (2013). Human-centered Region Selection and Weighting for Image Retrieval. In Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013) ISBN 978-989-8565-47-1, pages 729-734. DOI: 10.5220/0004348707290734


in Bibtex Style

@conference{visapp13,
author={Jean Martinet},
title={Human-centered Region Selection and Weighting for Image Retrieval},
booktitle={Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013)},
year={2013},
pages={729-734},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004348707290734},
isbn={978-989-8565-47-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013)
TI - Human-centered Region Selection and Weighting for Image Retrieval
SN - 978-989-8565-47-1
AU - Martinet J.
PY - 2013
SP - 729
EP - 734
DO - 10.5220/0004348707290734