(i.e., depth images) by reducing the matching space
between the 2D intensity and 3D depth images. The
depth images rendered from a 3D model were represented
with curvilinear saliency features. In addition,
an accurate representation based on multi-scale
curvilinear saliency with focus features was used to
reduce the effect of texture and background on the
features extracted from an intensity image. The depth
images were clustered with a rule-based clustering
method. The features of each cluster of depth images
were used to train a multi-class SVM that predicts
the group of depth images closest to the input
intensity image. The input image was then matched
against the predicted class to estimate the correct
3D pose. The RANSAC algorithm was used to refine
and verify the final viewpoint. The effectiveness of
the proposed system was evaluated on the public
PASCAL3D+ dataset. The proposed 2D/3D registration
algorithm yielded promising results, with a high
precision rate and an acceptable computation time.
Future work aims to extend the presented 2D/3D
registration algorithm using a deep learning system.
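
The RANSAC refinement step summarised above can be illustrated with a minimal sketch. It is not the implementation used in this work: to stay short, it assumes a toy 2D translation model in place of the full 3D pose, and the function name `ransac_translation`, its parameters, and the tolerance are hypothetical choices made for illustration only.

```python
import random


def ransac_translation(src, dst, n_iters=200, tol=1.0, seed=0):
    """Robustly estimate a 2D translation t such that dst ~= src + t.

    Toy stand-in for pose refinement (assumption: translation-only model).
    src, dst: equal-length lists of (x, y) tuples; pair i is a putative match.
    Returns ((tx, ty), inlier_indices).
    """
    if not src or len(src) != len(dst):
        raise ValueError("src and dst must be non-empty and of equal length")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_inliers = []
    for _ in range(n_iters):
        # Minimal sample: one correspondence fully determines a translation.
        i = rng.randrange(len(src))
        tx = dst[i][0] - src[i][0]
        ty = dst[i][1] - src[i][1]
        # Consensus set: matches whose residual under this hypothesis is small.
        inliers = [
            j for j, (s, d) in enumerate(zip(src, dst))
            if ((s[0] + tx - d[0]) ** 2 + (s[1] + ty - d[1]) ** 2) ** 0.5 <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set; for a translation this is the mean offset.
    n = len(best_inliers)
    tx = sum(dst[j][0] - src[j][0] for j in best_inliers) / n
    ty = sum(dst[j][1] - src[j][1] for j in best_inliers) / n
    return (tx, ty), best_inliers
```

The same hypothesise-and-verify loop carries over to pose estimation by replacing the one-point translation hypothesis with a minimal pose solver and the residual with a reprojection error.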
REFERENCES
Aubry, M., Maturana, D., Efros, A. A., Russell, B. C., and
Sivic, J. (2014). Seeing 3d chairs: exemplar
part-based 2d-3d alignment using a large dataset of cad
models. In Proceedings of the IEEE conference on
computer vision and pattern recognition,
pages 3762–3769.
Campbell, R. J. and Flynn, P. J. (2001). A survey of
free-form object representation and recognition
techniques. Computer Vision and Image Understanding,
81(2):166–210.
Choy, C. B., Stark, M., Corbett-Davies, S., and Savarese, S.
(2015). Enriching object detection with 2d-3d
registration and continuous viewpoint estimation. In
Computer Vision and Pattern Recognition (CVPR), 2015
IEEE Conference on, pages 2512–2520. IEEE.
Fischler, M. A. and Bolles, R. C. (1987). Random sample
consensus: a paradigm for model fitting with
applications to image analysis and automated cartography.
In Readings in computer vision, pages 726–740.
Elsevier.
Hartley, R. and Zisserman, A. (2003). Multiple view
geometry in computer vision. Cambridge university press.
Lee, Y. Y., Park, M. K., Yoo, J. D., and Lee, K. H. (2013).
Multi-scale feature matching between 2d image and
3d model. In SIGGRAPH Asia 2013 Posters, page 14.
ACM.
Lim, J. J., Khosla, A., and Torralba, A. (2014). Fpm: Fine
pose parts-based model with 3d cad models. In
European Conference on Computer Vision, pages 478–493.
Springer.
Liu, L. and Stamos, I. (2005). Automatic 3d to 2d
registration for the photorealistic rendering of urban scenes.
In Computer Vision and Pattern Recognition, 2005.
CVPR 2005. IEEE Computer Society Conference on,
volume 2, pages 137–143. IEEE.
Plötz, T. and Roth, S. (2015). Registering images to
untextured geometry using average shading gradients. In
Proceedings of the IEEE International Conference on
Computer Vision, pages 2030–2038.
Plötz, T. and Roth, S. (2017). Automatic registration of
images to untextured geometry using average shading
gradients. International Journal of Computer Vision,
125(1-3):65–81.
Ramalingam, S., Bouaziz, S., Sturm, P., and Brand, M.
(2009). Geolocalization using skylines from
omni-images. In Computer Vision Workshops (ICCV
Workshops), 2009 IEEE 12th International Conference on,
pages 23–30. IEEE.
Rashwan, H. A., Chambon, S., Gurdjos, P., Morin, G., and
Charvillat, V. (2016). Towards multi-scale feature
detection repeatable over intensity and depth images.
In Image Processing (ICIP), 2016 IEEE International
Conference on, pages 36–40. IEEE.
Rashwan, H. A., Chambon, S., Gurdjos, P., Morin, G., and
Charvillat, V. (2018). Using curvilinear features in
focus for registering a single image to a 3d object. arXiv
preprint arXiv:1802.09384.
Sattler, T., Leibe, B., and Kobbelt, L. (2011). Fast
image-based localization using direct 2d-to-3d matching. In
Computer Vision (ICCV), 2011 IEEE International
Conference on, pages 667–674. IEEE.
Su, H., Qi, C. R., Li, Y., and Guibas, L. J. (2015). Render for
cnn: Viewpoint estimation in images using cnns
trained with rendered 3d model views. In Proceedings of
the IEEE International Conference on Computer
Vision, pages 2686–2694.
Szeto, R. and Corso, J. J. (2017). Click here:
Human-localized keypoints as guidance for viewpoint
estimation. arXiv preprint arXiv:1703.09859.
Tamaazousti, M., Gay-Bellile, V., Collette, S. N.,
Bourgeois, S., and Dhome, M. (2011). Nonlinear
refinement of structure from motion reconstruction by
taking advantage of a partial knowledge of the
environment. In Computer Vision and Pattern
Recognition (CVPR), 2011 IEEE Conference on,
pages 3073–3080. IEEE.
Tulsiani, S. and Malik, J. (2015). Viewpoints and keypoints.
In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition, pages 1510–1519.
Xiang, Y., Mottaghi, R., and Savarese, S. (2014). Beyond
pascal: A benchmark for 3d object detection in the
wild. In Applications of Computer Vision (WACV),
2014 IEEE Winter Conference on, pages 75–82. IEEE.
Xu, C., Zhang, L., Cheng, L., and Koch, R. (2017). Pose
estimation from line correspondences: A complete
analysis and a series of solutions. IEEE
transactions on pattern analysis and machine intelligence,
39(6):1209–1222.
Effective 2D/3D Registration using Curvilinear Saliency Features and Multi-Class SVM