Absolute Spatial Context-aware Visual Feature Descriptors for Outdoor Handheld Camera Localization - Overcoming Visual Repetitiveness in Urban Environments

Daniel Kurz, Peter Georg Meier, Alexander Plopski, Gudrun Klinker

Abstract

We present a framework that enables 6DoF camera localization in outdoor environments by providing visual feature descriptors with an Absolute Spatial Context (ASPAC). These descriptors combine visual information from the image patch around a feature with spatial information, based on a model of the environment and the readings of sensors attached to the camera, such as GPS, accelerometers, and a digital compass. The result is a more distinctive description of features in the camera image, which correspond to 3D points in the environment. This is particularly helpful in urban environments containing large numbers of repetitive visual features. Additionally, we describe the first comprehensive test database for outdoor handheld camera localization, comprising over 45,000 real camera images of an urban environment, captured under natural camera motions and different illumination settings. For all these images, the dataset contains not only readings of the sensors attached to the camera, but also ground truth information on the full 6DoF camera pose, and the geometry and texture of the environment. Based on this dataset, which we have made available to the public, we show that our proposed framework provides both faster matching and better localization results than state-of-the-art methods.


Paper Citation


in Harvard Style

Kurz D., Meier P., Plopski A. and Klinker G. (2014). Absolute Spatial Context-aware Visual Feature Descriptors for Outdoor Handheld Camera Localization - Overcoming Visual Repetitiveness in Urban Environments. In Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014) ISBN 978-989-758-004-8, pages 56-67. DOI: 10.5220/0004683300560067


in Bibtex Style

@conference{visapp14,
author={Daniel Kurz and Peter Georg Meier and Alexander Plopski and Gudrun Klinker},
title={Absolute Spatial Context-aware Visual Feature Descriptors for Outdoor Handheld Camera Localization - Overcoming Visual Repetitiveness in Urban Environments},
booktitle={Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014)},
year={2014},
pages={56-67},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004683300560067},
isbn={978-989-758-004-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014)
TI - Absolute Spatial Context-aware Visual Feature Descriptors for Outdoor Handheld Camera Localization - Overcoming Visual Repetitiveness in Urban Environments
SN - 978-989-758-004-8
AU - Kurz D.
AU - Meier P.
AU - Plopski A.
AU - Klinker G.
PY - 2014
SP - 56
EP - 67
DO - 10.5220/0004683300560067