cally translated with respect to the new position of the
control point.
In order to drive the control points to their optimal 3D locations, a relation between the disparity information contained within the ROI and the control points on the shape primitive had to be derived. This is accomplished by estimating the surface normals of the disparity areas with respect to the control points; that is, each control point is moved in the direction of the nearest disparity surface according to its normal.
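As an illustration of this step, a minimal sketch is given below, assuming the disparity map has already been converted into a 3D surface point cloud with per-point normals; the names (update_control_points, step, etc.) are illustrative and not part of the original implementation:

    import numpy as np

    def update_control_points(control_points, surface_points,
                              surface_normals, step=0.005):
        """Move each control point toward the nearest disparity surface
        along that surface's estimated normal (illustrative sketch).
        control_points: (M, 3); surface_points, surface_normals: (N, 3)."""
        updated = []
        for p in control_points:
            # Nearest point on the reconstructed disparity surface.
            i = np.argmin(np.linalg.norm(surface_points - p, axis=1))
            n = surface_normals[i] / np.linalg.norm(surface_normals[i])
            # Orient the normal so the step moves the point toward the surface.
            if np.dot(surface_points[i] - p, n) < 0:
                n = -n
            updated.append(p + step * n)
        return np.asarray(updated)

In such a scheme, the update is iterated until the control points converge onto the disparity surface.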
Modeling the complete 3D shape of the objects plays a crucial role in the grasping procedure. Namely, if each 3D object point is precisely related to the pose of the robotic system, obtained through the algorithm from Section 2, then the control precision of autonomous robots equipped with redundant manipulators is much higher than in the case when object grasping points are extracted directly from 2D visual information.
Table 1: Statistical position and orientation errors along the three Cartesian axes between the proposed and the marker-based 3D camera pose estimation.
            X_e [m; deg]    Y_e [m; deg]    Z_e [m; deg]
Max err.    0.049; 4.2      0.059; 5.6      0.101; 10.1
Mean        0.013; 0.7      0.014; 0.7      0.042; 0.6
Std. dev.   0.021; 2.3      0.02; 2.6       0.064; 5.5
4 PERFORMANCE EVALUATION
The evaluation of the proposed machine vision system has been performed with respect to the real 3D poses of the objects of interest. These real 3D positions and orientations were determined manually using the following setup. A visual marker, considered to provide the ground truth information, was installed on the imaged scene in such a way that the poses of the objects could be easily measured with respect to the marker. The 3D pose of the marker was detected using the ARToolKit library, which provides subpixel-accurate estimation of the marker's location with an average error of ≈ 5 mm. From the calculated marker pose, a ground truth reference for the camera position and orientation was obtained as the inverse of the marker's pose matrix. The camera poses were then estimated using the proposed system and compared to the ground truth data provided by the ARToolKit marker.
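A minimal sketch of this ground truth computation, assuming the marker pose is available as a 4x4 homogeneous matrix (the function name and interface are illustrative, not the ARToolKit API):

    import numpy as np

    def camera_pose_from_marker(T_marker):
        """Ground-truth camera pose as the inverse of the marker's
        4x4 homogeneous pose matrix (closed-form rigid inverse)."""
        R = T_marker[:3, :3]
        t = T_marker[:3, 3]
        T_cam = np.eye(4)
        T_cam[:3, :3] = R.T          # inverse rotation
        T_cam[:3, 3] = -R.T @ t      # inverse translation
        return T_cam

The closed-form rigid inverse is preferred over a generic matrix inversion, since it preserves the orthogonality of the rotation block.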
The marker-less pose estimation algorithm described in this paper delivered a camera position and orientation closely matching the ground truth values. This agreement can be easily observed in the statistical error results between the two approaches, given in Tab. 1. Namely, for both position and orientation, the errors are small enough to ensure a good spatial localization of the camera, or robot, and to provide reliable depth map fusion.
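The statistics in Tab. 1 follow from a straightforward per-axis comparison of the two pose sequences; a sketch, assuming N paired estimates stored as (N, 3) arrays (positions in meters or Euler angles in degrees), applied separately to the position and orientation data:

    import numpy as np

    def pose_error_stats(est, gt):
        """Per-axis max, mean, and standard deviation of the absolute
        errors between estimated and ground-truth pose components."""
        err = np.abs(est - gt)
        return err.max(axis=0), err.mean(axis=0), err.std(axis=0)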
5 CONCLUSIONS
In this paper, a camera pose estimation and 3D object volumetric reconstruction system for service robotics purposes has been proposed. Its goal is to precisely determine the 3D structure of the imaged objects of interest with respect to the pose of the camera, that is, of the robot itself. As future work, the authors consider enhancing the speed of the proposed system using state-of-the-art parallel processing hardware.
ACKNOWLEDGEMENTS
This paper is supported by the Sectoral Oper-
ational Program Human Resources Development
(SOP HRD), financed from the European So-
cial Fund and by the Romanian Government un-
der the projects POSDRU/89/1.5/S/59323 and POS-
DRU/107/1.5/S/76945.