rithm generates a pose estimate supported with suffi-
cient confidence. For object A the two backgrounds A
and C are less problematic in terms of segmentation
noise. Rejection of uncertain poses reduces the mean
error as well as the number of failures. In case of
background C the neural network is able to exclude
all failures, albeit at the cost of rejecting two out of three
views. Notice that for test object B in front of back-
ground B, whose color coincides with the object color, nearly
70% of the original estimates are failures. In this case
the neural network ultimately classifies all estimates
as unreliable. Acceptable error and failure rates are
achieved for test object C in front of background B.
The mean estimation error Ē is small enough to al-
low an open loop grasp for 9 of 10 estimations. In
contrast, pose estimation on background A fails al-
most completely with a failure rate of 72% due to
the similar colors of object and background. Even if only 15%
of the estimates are accepted, the failure rate of 42%
is still not acceptable. Instead of an open loop grasp
control based on a single image and pose estimate it
is more robust to operate in feedback mode by ac-
quiring additional images. A Kalman filter approach
fuses observed pose estimates with the known camera
motions. The experimental results demonstrate that
2DOF pose estimation based on ACCHs is feasible
under the assumption of proper segmentation. The
main drawback of the proposed method is the sensi-
tivity with respect to noise and segmentation errors.
As a 2DOF pose estimation with ACCHs is substan-
tially more difficult, the approach does not achieve the
same level of robustness as in the case of 1DOF pose
estimation based on pure CCHs.
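
The feedback mode described above, in which successive pose estimates are fused with the known camera motions, can be illustrated by a minimal sketch. It is not the implementation used in the experiments. It assumes that the two pose angles are filtered independently by scalar Kalman filters, that each camera motion is expressed as a known shift of the two view angles in degrees, and that the variances are placeholder values. The names AngleFilter, fuse_sequence, meas_var and motion_var are hypothetical.

```python
from dataclasses import dataclass


def wrap_deg(a):
    """Wrap an angle (or angle difference) to the interval [-180, 180)."""
    return (a + 180.0) % 360.0 - 180.0


@dataclass
class AngleFilter:
    """Scalar Kalman filter for one pose angle in degrees (assumed model)."""
    angle: float  # current estimate of the view angle
    var: float    # variance of that estimate

    def predict(self, camera_motion, motion_var):
        # The known camera motion shifts the relative view angle;
        # motion_var models the uncertainty of that motion.
        self.angle = wrap_deg(self.angle + camera_motion)
        self.var += motion_var

    def update(self, measured, meas_var):
        # Standard scalar Kalman update with a wrapped innovation.
        gain = self.var / (self.var + meas_var)
        innovation = wrap_deg(measured - self.angle)
        self.angle = wrap_deg(self.angle + gain * innovation)
        self.var *= (1.0 - gain)


def fuse_sequence(first_pose, observations, meas_var=25.0, motion_var=1.0):
    """Fuse 2DOF pose estimates taken from several camera positions.

    first_pose:   (angle_1, angle_2) estimated from the first image
    observations: list of (camera_motion, pose_estimate) pairs, each a
                  pair of 2-tuples in degrees
    Returns a list [(angle, variance), (angle, variance)].
    """
    filters = [AngleFilter(a, meas_var) for a in first_pose]
    for motion, pose in observations:
        for f, du, z in zip(filters, motion, pose):
            f.predict(du, motion_var)
            f.update(z, meas_var)
    return [(f.angle, f.var) for f in filters]
```

With equal prior and measurement variance, a single update moves the estimate halfway toward the new measurement and halves the variance, so each additional consistent view tightens the fused pose.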
6 CONCLUSIONS
In this paper we presented a novel approach for 2DOF
pose estimation based on angular cooccurrence his-
tograms. Under the assumption of proper object back-
ground segmentation the accuracy of estimated poses
is sufficient for object manipulation with a two-finger
grasp. The confidence rating of the match value re-
sponse by the neural network is a suitable means to
further improve the robustness of pose estimation at
the cost of a reduced recognition rate. The quality of
the appearance based segmentation deteriorates sub-
stantially in the case of overlapping objects or back-
grounds with similar colors. The degradation reflects
itself in an ambiguous match value curve detected by
the neural network. In a robotic manipulation sce-
nario the camera is moved in order to capture an im-
age of the object from a presumably better perspec-
tive. The grasping motion is not executed until a suffi-
cient confidence in the prior pose estimation has been
achieved. Our experimental results show that earlier
appearance based methods for 1DOF pose estimation
can be extended to 2DOF pose estimation. How-
ever, 2DOF pose estimation based on ACCHs is no
longer scale invariant and therefore requires an ap-
proximate initial estimate of scale. For our task the
reach of the robot arm is limited so that the scale does
not vary much across different configurations. There-
fore, a single training set of ACCHs captured at an
intermediate camera to object range is valid across
the entire workspace of the manipulator. An avenue
for future research is the integration of appearance
based approaches with an image based visual servo-
ing scheme. In image based visual servoing the cor-
respondence problem is prevalent, in particular if objects are
only partially visible. To solve the correspondence
problem for visual servoing tasks the objects are of-
ten labeled with artificial landmarks like color blobs.
These approaches are therefore constrained to struc-
tured, synthetic environments. To overcome these
limitations, visual servoing should instead be based on the entire
appearance of an object.
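
The strategy of deferring the grasp until a pose estimate is sufficiently confident can be sketched as a simple acquisition loop. This is an illustration under stated assumptions, not the system described in the paper: estimate_pose and move_camera are hypothetical callbacks standing in for the ACCH matcher with its neural network confidence rating and for the robot interface, and the threshold and the maximum number of views are arbitrary placeholder values.

```python
def acquire_confident_pose(estimate_pose, move_camera,
                           threshold=0.8, max_views=5):
    """Return (pose, view_index) for the first sufficiently confident
    estimate, or None if no view reaches the threshold.

    estimate_pose(view) -> (pose, confidence)   # hypothetical callback
    move_camera(view)   -> None                 # hypothetical callback
    """
    for view in range(max_views):
        pose, confidence = estimate_pose(view)
        if confidence >= threshold:
            # Confidence is sufficient: the grasp may be executed.
            return pose, view
        if view + 1 < max_views:
            # Estimate rejected: move to the next candidate perspective.
            move_camera(view + 1)
    return None
```

Returning None when every view is rejected leaves the decision to the caller, which mirrors the trade-off reported above between robustness and a reduced recognition rate.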