fectively. This is shown in Fig. 6. The same holds for
faces that are far from the camera: the facial region is
then too small to provide sufficient pixel information
for the model to work with. Hence, contours cannot be
obtained successfully in such images, and depth cannot
be estimated by this technique.
The relation given in equation 8 does not hold for
objects at large distances. This explains why the depth
calculated for the facial image in Fig. 9 was incorrect.
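As a rough illustration of why accuracy degrades with distance, the following sketch assumes the standard rectified pinhole stereo relation Z = fB/d, which may not match equation 8 exactly. The function names, focal length, and baseline are hypothetical values chosen for illustration, not those of the camera used in our experiments. Under this assumption, the depth error caused by a fixed disparity error grows with the square of the distance.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for a rectified stereo pair (assumed pinhole model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px=1.0):
    """First-order depth uncertainty: dZ ~= Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


if __name__ == "__main__":
    FOCAL_PX = 1000.0   # hypothetical focal length, in pixels
    BASELINE_M = 0.1    # hypothetical stereo baseline, in metres
    for z in (0.5, 1.0, 5.0):
        d = FOCAL_PX * BASELINE_M / z          # disparity at depth z
        err = depth_error(FOCAL_PX, BASELINE_M, z)
        print(f"Z = {z:.1f} m: disparity = {d:.1f} px, "
              f"depth error per pixel of disparity error = {err:.4f} m")
```

With these assumed values, a one-pixel disparity error at 5 metres produces a depth error one hundred times larger than the same error at 0.5 metres, which is consistent with the failure observed for distant faces.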
6 CONCLUSIONS AND FUTURE WORK
We have developed a novel method to recover sparse
depth information for persons whose faces are present
in a given scene. The approach relies on the ASM
features learnt for a given face and therefore does not
require an explicit feature detection step to extract
the feature points. An advantage of the proposed
approach is that the depth of individual facial
features, such as the eyes and mouth, can also be
calculated when the images are captured with
sufficient zoom.
Figure 9: Image of a person standing at a large distance
(around 5 metres) from the camera. The image size is
4608 × 3456 pixels. In such cases, the pixel information
present in the facial region is insufficient both for the
proposed algorithm and for the Viola-Jones face detection
algorithm.
A comparison of our results with state-of-the-art
feature-detection-based sparse depth recovery
techniques remains to be performed for validation. We
plan to extend this approach to handle scenes captured
with low-resolution cameras and persons located at
much larger distances from the camera. These
challenging situations can be addressed by applying
low-level image processing tools as a pre-processing
step before the proposed algorithm. As stereo cameras
have made their way into the digital camera market,
the proposed approach has the potential to inform the
user about the proximity of a person to the camera.
REFERENCES
Barnard, S. T. and Fischler, M. A. (1982). Computational
stereo. ACM Computing Surveys (CSUR), 14(4):553–
572.
Chakraborty, I., Cheng, H., and Javed, O. (2013). 3d vi-
sual proxemics: Recognizing human interactions in
3d from a single image. In IEEE CVPR, CVPR ’13,
pages 3406–3413.
Cootes, T. F., Edwards, G. J., and Taylor, C. J. (2001). Ac-
tive appearance models. Pattern Analysis and Ma-
chine Intelligence, IEEE Transactions on, 23(6):681–
685.
Cootes, T. F., Taylor, C. J., Cooper, D. H., and Graham,
J. (1995). Active shape models-their training and ap-
plication. Computer vision and image understanding,
61(1):38–59.
Hartley, R. and Zisserman, A. (2004). Multiple View
Geometry in Computer Vision. Cambridge University Press,
2nd edition.
Hoff, W. and Ahuja, N. (1989). Surfaces from stereo: In-
tegrating feature matching, disparity estimation, and
contour detection. Pattern Analysis and Machine In-
telligence, IEEE Transactions on, 11(2):121–136.
Kanade, T. and Okutomi, M. (1994). A stereo matching
algorithm with an adaptive window: Theory and ex-
periment. Pattern Analysis and Machine Intelligence,
IEEE Transactions on, 16(9):920–932.
Marr, D. and Poggio, T. (1971). Cooperative computation
of stereo disparity. Appl. Phys, 42:3451.
Matsumoto, Y. and Zelinsky, A. (2000). An algorithm for
real-time stereo vision implementation of head pose
and gaze direction measurement. In Automatic Face
and Gesture Recognition, 2000. Proceedings. Fourth
IEEE International Conference on, pages 499–504.
IEEE.
Scharstein, D. and Szeliski, R. (2002). A taxonomy and
evaluation of dense two-frame stereo correspondence
algorithms. International journal of computer vision,
47(1-3):7–42.
Tuytelaars, T. and Mikolajczyk, K. (2008). Local invariant
feature detectors: a survey. Foundations and Trends®
in Computer Graphics and Vision, 3(3):177–280.
Viola, P. and Jones, M. J. (2004). Robust real-time face
detection. International journal of computer vision,
57(2):137–154.
Zhao, W., Chellappa, R., Phillips, P. J., and Rosenfeld, A.
(2003). Face recognition: A literature survey. ACM
Computing Surveys (CSUR), 35(4):399–458.