OCCLUSION HANDLING FOR THE INTEGRATION OF VIRTUAL OBJECTS INTO VIDEO

Kai Cordes, Björn Scheuermann, Bodo Rosenhahn, Jörn Ostermann

Abstract

This paper demonstrates how to effectively exploit the occlusion and reappearance information of feature points in structure and motion recovery from video. Feature tracks are discontinued when scene points are temporarily occluded by foreground objects. If these features reappear after the occlusion, they are reconnected to the correct, previously discontinued trajectory during sequential camera and scene estimation. The combination of optical flow for features in consecutive frames and SIFT matching for the wide-baseline feature connection provides accurate and stable feature tracking. The knowledge of the occluded parts of a connected feature track is used to feed a segmentation algorithm that automatically crops the foreground image regions. The resulting segmentation provides an important step towards scene understanding and significantly eases the integration of virtual objects into video. The presented approach enables integrated virtual objects to be automatically occluded by foreground regions of the video. Demonstrations show very realistic augmented reality results.
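The abstract describes reconnecting reappearing features to their previously discontinued trajectories via wide-baseline SIFT matching. As an illustrative sketch only (not the authors' implementation), the linking step can be approximated by nearest-neighbour descriptor matching with Lowe's ratio test; the function name, the pure-NumPy distance computation, and the 0.8 ratio threshold are assumptions for this example.

```python
import numpy as np

def link_reappearing_features(lost_desc, new_desc, ratio=0.8):
    """Link features reappearing after an occlusion to discontinued tracks.

    lost_desc: (M, D) array, last descriptor of each discontinued track.
    new_desc:  (N, D) array, descriptors of newly detected features.
    Returns {new_feature_index: lost_track_index} for matches passing
    Lowe's ratio test (nearest distance < ratio * second-nearest distance).
    Hypothetical sketch; a real system would use SIFT descriptors and
    enforce geometric consistency with the estimated camera motion.
    """
    links = {}
    for j, d in enumerate(new_desc):
        dists = np.linalg.norm(lost_desc - d, axis=1)  # Euclidean distances
        order = np.argsort(dists)                      # nearest first
        if len(dists) > 1 and dists[order[0]] < ratio * dists[order[1]]:
            links[j] = int(order[0])
        elif len(dists) == 1:
            links[j] = int(order[0])
    return links
```

An ambiguous descriptor that is roughly equidistant to two tracks fails the ratio test and stays unlinked, which is the standard way to avoid wrong reconnections at the cost of recall.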



Paper Citation


in Harvard Style

Cordes, K., Scheuermann, B., Rosenhahn, B. and Ostermann, J. (2012). OCCLUSION HANDLING FOR THE INTEGRATION OF VIRTUAL OBJECTS INTO VIDEO. In Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP (VISIGRAPP 2012), ISBN 978-989-8565-04-4, pages 173-180. DOI: 10.5220/0003856601730180


in Bibtex Style

@conference{visapp12,
author={Kai Cordes and Björn Scheuermann and Bodo Rosenhahn and Jörn Ostermann},
title={OCCLUSION HANDLING FOR THE INTEGRATION OF VIRTUAL OBJECTS INTO VIDEO},
booktitle={Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP (VISIGRAPP 2012)},
year={2012},
pages={173-180},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003856601730180},
isbn={978-989-8565-04-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP (VISIGRAPP 2012)
TI - OCCLUSION HANDLING FOR THE INTEGRATION OF VIRTUAL OBJECTS INTO VIDEO
SN - 978-989-8565-04-4
AU - Cordes K.
AU - Scheuermann B.
AU - Rosenhahn B.
AU - Ostermann J.
PY - 2012
SP - 173
EP - 180
DO - 10.5220/0003856601730180