Indexing Multimedia Content for Textual Querying: A Multimodal Approach

Abdesalam Amrane, Hakima Mellah, Youssef Amghar, Rachid Aliradi


Multimedia retrieval approaches fall into three categories: those that use textual information, those that use low-level information, and those that combine different kinds of information extracted from multimedia. Each approach has advantages and disadvantages for improving multimedia retrieval systems, and recent work is oriented towards multimodal approaches. In this context, we propose an approach that combines the surrounding text with information extracted from the visual content of multimedia, representing both in the same repository so that multimedia content can be queried by keywords or concepts. Each word in a query or in a multimedia description is disambiguated using WordNet in order to determine its semantic concept.
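The idea of a shared repository queried by disambiguated concepts can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the disambiguation algorithm, so a simplified gloss-overlap heuristic is assumed here, and the tiny `SENSES` table is a hypothetical stand-in for WordNet. The names `disambiguate`, `build_index`, and `search`, as well as the concept labels, are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical miniature sense inventory standing in for WordNet.
# Each ambiguous word maps to candidate concepts, each with a short gloss.
SENSES = {
    "bank": {
        "bank.financial": "institution that accepts money deposits and gives loans",
        "bank.river": "sloping land beside a river or lake with water",
    },
    "jaguar": {
        "jaguar.animal": "large spotted wild cat living in forest",
        "jaguar.car": "luxury car brand vehicle with engine",
    },
}


def disambiguate(word, context):
    """Pick the concept whose gloss shares the most words with the context.

    Assumed simplification: plain gloss overlap. Words absent from SENSES
    are returned unchanged and treated as their own concept.
    """
    candidates = SENSES.get(word)
    if not candidates:
        return word
    context_words = set(context) - {word}
    # sorted() makes tie-breaking deterministic (first concept alphabetically).
    return max(
        sorted(candidates),
        key=lambda c: len(context_words & set(candidates[c].split())),
    )


def to_concepts(words):
    """Map a list of lowercase words to disambiguated concepts."""
    return [disambiguate(w, words) for w in words]


def build_index(items):
    """Build an inverted index: concept -> set of item ids.

    `items` maps an id to (surrounding_text, visual_labels). Both sources
    are merged into one word list so they end up in the same repository.
    """
    index = defaultdict(set)
    for item_id, (text, visual_labels) in items.items():
        words = text.lower().split() + [v.lower() for v in visual_labels]
        for concept in to_concepts(words):
            index[concept].add(item_id)
    return index


def search(index, query):
    """Return ids of items that match every concept in the query."""
    concepts = to_concepts(query.lower().split())
    results = [index.get(c, set()) for c in concepts]
    return set.intersection(*results) if results else set()
```

In this sketch, an image whose surrounding text mentions "jaguar" and whose detected visual labels include "cat" and "forest" is indexed under the animal concept, so a query such as "jaguar forest" matches it while a car photograph described with the same word does not. Merging text and visual labels before disambiguation is one possible design; weighting the two sources separately would be another.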



Paper Citation

in Harvard Style

Amrane A., Mellah H., Amghar Y. and Aliradi R. (2013). Indexing Multimedia Content for Textual Querying: A Multimodal Approach. In Proceedings of the 2nd International Workshop on Web Intelligence - Volume 1: WEBI, (ICEIS 2013) ISBN 978-989-8565-63-1, pages 3-12. DOI: 10.5220/0004576200030012
