Photo-based Multimedia Applications using Image Features Detection

Rui Nóbrega, Nuno Correia

Abstract

This paper proposes a framework for creating interactive multimedia applications that make use of features detected in user-captured photos. The goal is to build games and architectural and space-planning applications that interact with visual elements in the images, such as walls, floors and empty spaces. The framework relies on a semi-automatic algorithm to detect scene elements and camera parameters. Using the detected features, virtual objects can be inserted into the scene. Several example applications are presented and discussed, and the reliability of the detection algorithm is compared with that of other systems. The presented solution analyses the photos using graph cuts for segmentation, together with vanishing point detection and line analysis, to detect the scene elements. The main advantage of the proposed framework is the semi-automatic creation of the three-dimensional model used in mixed reality applications. This enables scenarios in which the user supplies the input scene without much prior knowledge or experience. The currently implemented examples include a furniture positioning system and a snake game played in a maze that the user builds in the real world. The proposed system is well suited to mobile multimedia applications in which interaction is combined with the device's back camera.
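
The abstract does not say how the vanishing points are computed, so the following is only a minimal sketch of one common approach, not the authors' algorithm. Given line segments assumed to converge in the image (for example, floor or wall edges), it finds the point that minimises the summed squared distance to their supporting lines. The names `line_through` and `estimate_vanishing_point`, and the tolerance `eps`, are hypothetical and chosen for illustration.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def line_through(p: Point, q: Point) -> Tuple[float, float, float]:
    """Return (a, b, c) with a*x + b*y + c = 0 through p and q,
    scaled so that a^2 + b^2 = 1 (a*x + b*y + c is then a signed distance)."""
    (x1, y1), (x2, y2) = p, q
    # Cross product of the homogeneous points (x1, y1, 1) and (x2, y2, 1).
    a, b = y1 - y2, x2 - x1
    c = x1 * y2 - x2 * y1
    norm = math.hypot(a, b)
    if norm == 0.0:
        raise ValueError("segment endpoints coincide")
    return a / norm, b / norm, c / norm


def estimate_vanishing_point(
    segments: List[Segment], eps: float = 1e-9
) -> Optional[Point]:
    """Least-squares intersection of the lines supporting the segments.

    Returns None when the lines are (near) parallel in the image, i.e. the
    vanishing point lies at infinity and has no finite pixel position.
    """
    saa = sab = sbb = sac = sbc = 0.0
    for p, q in segments:
        a, b, c = line_through(p, q)
        saa += a * a
        sab += a * b
        sbb += b * b
        sac += a * c
        sbc += b * c
    # Normal equations: [[saa, sab], [sab, sbb]] @ [x, y] = [-sac, -sbc].
    det = saa * sbb - sab * sab
    if abs(det) < eps:
        return None
    x = (-sac * sbb + sab * sbc) / det
    y = (-saa * sbc + sab * sac) / det
    return x, y


if __name__ == "__main__":
    # Three synthetic segments whose supporting lines all pass through one point.
    segments = [
        ((0.0, 0.0), (50.0, 25.0)),
        ((0.0, 100.0), (50.0, 75.0)),
        ((0.0, 50.0), (10.0, 50.0)),
    ]
    print(estimate_vanishing_point(segments))
```

In a real pipeline the segments would come from an edge or line detector and would first be grouped by dominant direction, typically with an outlier-rejection step, since a plain least-squares fit is sensitive to segments that belong to a different vanishing point.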


Paper Citation


in Harvard Style

Nóbrega R. and Correia N. (2013). Photo-based Multimedia Applications using Image Features Detection. In Proceedings of the International Conference on Computer Graphics Theory and Applications and International Conference on Information Visualization Theory and Applications - Volume 1: GRAPP, (VISIGRAPP 2013) ISBN 978-989-8565-46-4, pages 298-307. DOI: 10.5220/0004244702980307


in Bibtex Style

@conference{grapp13,
author={Rui Nóbrega and Nuno Correia},
title={Photo-based Multimedia Applications using Image Features Detection},
booktitle={Proceedings of the International Conference on Computer Graphics Theory and Applications and International Conference on Information Visualization Theory and Applications - Volume 1: GRAPP, (VISIGRAPP 2013)},
year={2013},
pages={298-307},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004244702980307},
isbn={978-989-8565-46-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Computer Graphics Theory and Applications and International Conference on Information Visualization Theory and Applications - Volume 1: GRAPP, (VISIGRAPP 2013)
TI - Photo-based Multimedia Applications using Image Features Detection
SN - 978-989-8565-46-4
AU - Nóbrega R.
AU - Correia N.
PY - 2013
SP - 298
EP - 307
DO - 10.5220/0004244702980307