gree) while the lower level descriptors have less re-
liability. Therefore, in our future research we pro-
pose to combine the different level features, perhaps
within a coarse-to-fine search framework, in order to
optimize the labelling performance. In this case the
largest grouping would be matched first, and the
search process then repeated using successively
lower-level groupings, which are matched using
increasingly constrained search bounds.
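The coarse-to-fine idea proposed above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the representation of each feature as a ((x, y), descriptor) pair, the translation-only pose update and the Euclidean descriptor distance are all assumptions made for illustration.

```python
import math

def nearest_in_window(ref_feat, test_feats, offset, radius):
    """Return the index of the best test feature inside the search window, or None."""
    (rx, ry), rdesc = ref_feat
    px, py = rx + offset[0], ry + offset[1]  # predicted position in the test image
    best, best_dist = None, math.inf
    for i, ((tx, ty), tdesc) in enumerate(test_feats):
        if math.hypot(tx - px, ty - py) > radius:
            continue  # outside the current search bound
        d = math.dist(rdesc, tdesc)  # Euclidean descriptor distance (assumption)
        if d < best_dist:
            best, best_dist = i, d
    return best

def coarse_to_fine_match(ref_levels, test_levels, radii):
    """Match level by level, coarsest first, with a shrinking search radius.

    ref_levels / test_levels: lists of feature lists, largest grouping first.
    radii: one search radius per level, expected to decrease.
    Returns the finest-level matches as (ref_index, test_index) pairs and
    the final translation estimate.
    """
    offset = (0.0, 0.0)  # hypothetical translation-only pose estimate
    matches = []
    for ref_feats, test_feats, radius in zip(ref_levels, test_levels, radii):
        matches = []
        for j, ref_feat in enumerate(ref_feats):
            i = nearest_in_window(ref_feat, test_feats, offset, radius)
            if i is not None:
                matches.append((j, i))
        if matches:
            # Re-estimate the translation from this level's matches so the
            # next (finer) level can search within tighter bounds.
            n = len(matches)
            dx = sum(test_feats[i][0][0] - ref_feats[j][0][0] for j, i in matches) / n
            dy = sum(test_feats[i][0][1] - ref_feats[j][0][1] for j, i in matches) / n
            offset = (dx, dy)
    return matches, offset
```

In this sketch the coarse level only has to locate the object roughly; the finer level then rejects candidates whose descriptors match but which lie outside the tightened window.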
5 CONCLUSIONS
In this paper, we present a new hexagonally grouped
and rotationally invariant image descriptor, the
HexHoG, that can be computed recursively to generate
hierarchical features. Hierarchical grouping affords
sufficient discriminability to allow HexHoG descriptors
to be sampled at all detected edgel positions (as
opposed to only corner locations) in order to match edge
contours between a reference and a test image. Given
an initial class and pose for a detected object, we are
then able to apply dense local HexHoG matching both to
improve the detected object’s pose estimation
and to label directly the edge contours of the object
as they appear in a test image. Our proposed
methodology therefore supports
segmentation-through-matching.
Our validation experiments show that matching
HexHoG features, which are based only on appear-
ance information computed at edgel locations, has the
potential to improve the performance of object pose
estimation by approximately a factor of 2. By im-
proving the accuracy of the pose estimation process,
it is then possible to project contours from the refer-
ence image into the test image and annotate the lo-
cation of a detected object with sufficient accuracy
for many practical tasks such as grasping in robotics.
Moreover, improved pose estimation also tightens the
search constraints required to match test image edge
contours directly, so that HexHoG matching offers
the possibility of recovering the actual edgel labels
detected in the test image that correspond to contour
edgels in the reference image, as described above.
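The projection step described above can be sketched as follows, assuming the estimated pose is available as a 2x3 affine matrix. The function names, the matrix layout and the pixel tolerance are hypothetical, and the sketch only gathers candidate test edgels by position; it does not perform the HexHoG descriptor comparison itself.

```python
import math

def project_contour(contour, affine):
    """Map reference contour points into the test image with a 2x3 affine matrix."""
    (a, b, tx), (c, d, ty) = affine
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in contour]

def candidates_near(projected, test_edgels, tol):
    """For each projected point, list the indices of test edgels within tol pixels.

    A tighter pose estimate permits a smaller tol, and hence fewer
    candidates to compare by descriptor.
    """
    return [
        [i for i, (ex, ey) in enumerate(test_edgels)
         if math.hypot(ex - px, ey - py) <= tol]
        for px, py in projected
    ]
```

For example, with a 90 degree rotation plus translation, `project_contour([(0, 0), (1, 0)], ((0, -1, 5), (1, 0, 2)))` maps the two reference points to (5, 2) and (5, 3) in the test image.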
Our results indicate that for purely affine pose
transformations, the proposed scheme can recover a
significant fraction of edgel labellings in the test
image. In many situations where the pose relationship
between the target object contained in the reference
and test images is non-affine, e.g. under out-of-plane
rotation or projective distortion, dense HexHoG
feature matching has the potential to maintain
pixel-accurate correspondences between the edge
contours detected within the test and reference object
images.
Our future work will focus on incorporating an
improved edge detector, hierarchical approaches to
matching the HexHoG features, and improved
post-labelling processing for determining edgel
connectivity and edgel contour shape representation.
ACKNOWLEDGEMENTS
The authors acknowledge financial support from the
Chinese Scholarship Council, China, and the Eu-
ropean Union within the Strategic Research Project
Clopema, Project No. FP7-288553.
REFERENCES
Alahi, A., Ortiz, R., and Vandergheynst, P. (2012). FREAK:
Fast retina keypoint. In Computer Vision and Pattern
Recognition (CVPR), 2012 IEEE Conference on,
pages 510–517. IEEE.
Borenstein, E. and Ullman, S. (2002). Class-specific,
top-down segmentation. In Computer Vision – ECCV 2002,
pages 109–122. Springer.
Borji, A. and Itti, L. (2012). Exploiting local and global
patch rarities for saliency detection. In Computer
Vision and Pattern Recognition (CVPR), 2012 IEEE
Conference on, pages 478–485. IEEE.
Brown, M., Hua, G., and Winder, S. (2011). Discriminative
learning of local image descriptors. Pattern Analy-
sis and Machine Intelligence, IEEE Transactions on,
33(1):43–57.
Dalal, N. and Triggs, B. (2005). Histograms of oriented gra-
dients for human detection. In Computer Vision and
Pattern Recognition, 2005. CVPR 2005. IEEE Com-
puter Society Conference on, volume 1, pages 886–
893. IEEE.
Felzenszwalb, P. F., Girshick, R. B., McAllester, D., and
Ramanan, D. (2010). Object detection with discrim-
inatively trained part-based models. Pattern Analy-
sis and Machine Intelligence, IEEE Transactions on,
32(9):1627–1645.
Ferrari, V., Jurie, F., and Schmid, C. (2010). From images
to shape models for object detection. International
Journal of Computer Vision, 87(3):284–303.
Geusebroek, J.-M., Burghouts, G. J., and Smeulders, A. W.
(2005). The Amsterdam library of object images.
International Journal of Computer Vision,
61(1):103–112.
Kontschieder, P., Riemenschneider, H., Donoser, M., and
Bischof, H. (2011). Discriminative learning of con-
tour fragments for object detection. In BMVC, pages
1–12.
Lazebnik, S., Schmid, C., and Ponce, J. (2006). Beyond
bags of features: Spatial pyramid matching for rec-
ognizing natural scene categories. In Computer Vi-
sion and Pattern Recognition, 2006 IEEE Computer