Exploring Data Fusion under the Image Retrieval Domain

Nádia P. Kozievitch, Carmem Satie Hara, Jaqueline Nande, Ricardo da S. Torres


Advanced services for data compression, data storage, and data transmission have been developed and are widely used to meet the requirements of systems across diverse application domains. To reuse, integrate, unify, manage, and support heterogeneous resources, a number of works and concepts have emerged that aim to facilitate content aggregation and assist system developers. Images, together with existing Content-Based Image Retrieval services, can play a key role in information systems, given the wide availability of images and the need to integrate them with existing collections, metadata, and image manipulation software and applications. In this work, we explore a data fusion approach for resolving data value conflicts in the image retrieval domain. In particular, we target the value conflicts that result from different features when the data produced by the Content-Based Image Retrieval process is integrated with image metadata provided by a number of sources and applications. Our approach reduces the need for human intervention in keeping a clean and integrated view of an image repository when new data sources are added to an image management system.
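
As an illustration only, the kind of value conflict described above can be sketched in Python. The sketch assumes a hypothetical record layout of (source, attribute, value) tuples and a simple resolution strategy: majority vote, with ties broken by a source-priority list. Neither the names nor the strategy are taken from the paper, which may resolve conflicts differently.

```python
from collections import Counter, defaultdict


def fuse(records, source_priority):
    """Resolve conflicting attribute values for one image.

    records: iterable of (source, attribute, value) tuples (hypothetical layout).
    source_priority: source names, most trusted first (an assumption).
    Returns a dict mapping each attribute to a single fused value.
    """
    by_attr = defaultdict(list)
    for source, attr, value in records:
        by_attr[attr].append((source, value))

    # Lower rank means more trusted; unknown sources rank last.
    rank = {name: i for i, name in enumerate(source_priority)}
    fused = {}
    for attr, claims in by_attr.items():
        votes = Counter(value for _, value in claims)
        top = max(votes.values())
        tied = {v for v, n in votes.items() if n == top}
        # Among the most frequent values, keep the one claimed
        # by the most trusted source.
        _, best_value = min(
            ((s, v) for s, v in claims if v in tied),
            key=lambda claim: rank.get(claim[0], len(rank)),
        )
        fused[attr] = best_value
    return fused


# Hypothetical sources: two CBIR descriptors and one curated metadata set.
records = [
    ("cbir_descriptor_a", "category", "butterfly"),
    ("cbir_descriptor_b", "category", "moth"),
    ("curated_metadata", "category", "butterfly"),
    ("cbir_descriptor_a", "author", "unknown"),
    ("curated_metadata", "author", "J. Doe"),
]
priority = ["curated_metadata", "cbir_descriptor_a", "cbir_descriptor_b"]
result = fuse(records, priority)
```

Here the conflicting `category` values are settled by the majority, while the tied `author` values fall back to the most trusted source. A new source only needs a position in the priority list, so no manual review is required per record.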



Paper Citation

in Harvard Style

Kozievitch, N. P., Hara, C. S., Nande, J. and Torres, R. da S. (2014). Exploring Data Fusion under the Image Retrieval Domain. In Proceedings of the 16th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-027-7, pages 171-178. DOI: 10.5220/0004869901710178
