Pose Estimation using a Hierarchical 3D Representation of Contours and Surfaces

Anders Glent Buch, Dirk Kraft, Joni-Kristian Kämäräinen, Norbert Krüger

2013

Abstract

We present a system for estimating the pose of rigid objects using texture and contour information. From a stereo view of a scene, an early cognitive vision system reconstructs a sparse hierarchical scene representation. We define an object model in terms of a simple context descriptor of the contour and texture features, giving a sparse yet descriptive object representation. Using these descriptors, we search the correspondence space to remove outliers and compute the object pose. We evaluate the approach extensively on stereo images of a variety of real-world objects rendered in a controlled virtual environment. The experiments show that 3D texture and contour information play complementary roles, allowing pose estimation with high robustness and accuracy.
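
As an illustration of the last step (outlier removal and pose computation from putative 3D-3D correspondences), the following is a minimal sketch only. It assumes a RANSAC-style random sampling loop and an SVD-based least-squares rigid alignment; the abstract does not specify the search procedure, so this stands in for it and is not the authors' implementation. The function names, the tolerance, the iteration count and the synthetic, noise-free data are all hypothetical.

```python
import numpy as np


def rigid_transform(src, dst):
    """Least-squares R, t such that dst ≈ R @ src + t (SVD-based alignment)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)  # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Force det(R) = +1 so the result is a rotation, not a reflection.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t


def ransac_pose(model_pts, scene_pts, n_iter=200, inlier_tol=0.01, seed=0):
    """Estimate a pose from correspondences model_pts[i] <-> scene_pts[i].

    Hypothetical parameters: n_iter random minimal samples of 3
    correspondences, inlier_tol as the Euclidean residual threshold.
    """
    rng = np.random.default_rng(seed)
    n = len(model_pts)
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(n_iter):
        idx = rng.choice(n, size=3, replace=False)
        R, t = rigid_transform(model_pts[idx], scene_pts[idx])
        residuals = np.linalg.norm(model_pts @ R.T + t - scene_pts, axis=1)
        inliers = residuals < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("no consistent set of correspondences found")
    # Refit on all inliers of the best hypothesis.
    R, t = rigid_transform(model_pts[best_inliers], scene_pts[best_inliers])
    return R, t, best_inliers


# Synthetic example: 40 correspondences, the first 10 are wrong matches.
rng = np.random.default_rng(1)
model = rng.uniform(-1.0, 1.0, size=(40, 3))
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.5, -0.2, 1.0])
scene = model @ R_true.T + t_true
scene[:10] = rng.uniform(2.0, 3.0, size=(10, 3))  # outlier correspondences

R_est, t_est, inliers = ransac_pose(model, scene)
print("inliers kept:", int(inliers.sum()))
print("rotation error:", float(np.abs(R_est - R_true).max()))
```

In this sketch the correspondences are given by index; in a descriptor-based system they would come from matching feature descriptors between model and scene, and real data would carry noise, so the inlier tolerance would have to be chosen accordingly.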



Paper Citation


in Harvard Style

Glent Buch A., Kraft D., Kämäräinen J. and Krüger N. (2013). Pose Estimation using a Hierarchical 3D Representation of Contours and Surfaces. In Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013) ISBN 978-989-8565-47-1, pages 105-111. DOI: 10.5220/0004275801050111


in Bibtex Style

@conference{visapp13,
author={Anders Glent Buch and Dirk Kraft and Joni-Kristian Kämäräinen and Norbert Krüger},
title={Pose Estimation using a Hierarchical 3D Representation of Contours and Surfaces},
booktitle={Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013)},
year={2013},
pages={105-111},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004275801050111},
isbn={978-989-8565-47-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2013)
TI - Pose Estimation using a Hierarchical 3D Representation of Contours and Surfaces
SN - 978-989-8565-47-1
AU - Glent Buch A.
AU - Kraft D.
AU - Kämäräinen J.
AU - Krüger N.
PY - 2013
SP - 105
EP - 111
DO - 10.5220/0004275801050111