of the landmarks in a coarse-to-fine manner using a
series of cascaded predictors, which makes the approach
robust. Indeed, predicting the landmarks independently
yields high precision, since a failure to find the
correct location of one landmark does not propagate
to the others. The regressors at each level of the
cascade are based on gradient boosting. Three kinds
of weak regressors have been assessed: linear regres-
sors, non-parametric regressors and regression trees.
Gradient boosted trees performed best.
This simple scheme proved very effective in terms of
location error compared with the other approaches
tested. The approach is also very fast: it takes
8 milliseconds to compute the locations of 20 landmarks
(not counting the computation of the integral
image, which is typically required for the detection of
the face).
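As a minimal illustration of the boosting principle used at each level of the cascade, the sketch below fits gradient boosted regression stumps (depth-1 trees) to a scalar toy problem. It is not the implementation evaluated in this paper: the squared loss, the scalar feature, the shrinkage value and all function names are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# All names below are hypothetical and chosen for illustration only.
Stump = Tuple[float, float, float]  # (threshold, value if x <= threshold, value otherwise)


def fit_stump(xs: Sequence[float], targets: Sequence[float]) -> Stump:
    """Return the depth-1 regression tree minimising squared error on (xs, targets)."""
    mean = sum(targets) / len(targets)
    # Fallback when no split is possible: a constant prediction.
    best_err, best = float("inf"), (float("inf"), mean, mean)
    for t in sorted(set(xs))[:-1]:  # the largest value would leave the right side empty
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
        if err < best_err:
            best_err, best = err, (t, ml, mr)
    return best


def predict_stump(stump: Stump, x: float) -> float:
    t, left_value, right_value = stump
    return left_value if x <= t else right_value


def fit_boosted(xs: Sequence[float], ys: Sequence[float],
                n_rounds: int = 50, shrinkage: float = 0.5) -> Tuple[float, List[Stump]]:
    """Gradient boosting with squared loss (an assumption of this sketch)."""
    base = sum(ys) / len(ys)  # initial constant model
    preds = [base] * len(ys)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        # For squared loss the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + shrinkage * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, stumps


def predict_boosted(base: float, stumps: Sequence[Stump], x: float,
                    shrinkage: float = 0.5) -> float:
    return base + shrinkage * sum(predict_stump(s, x) for s in stumps)


def mse(preds: Sequence[float], ys: Sequence[float]) -> float:
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


# Toy data: learn y = x^2 on ten points.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]
base, stumps = fit_boosted(xs, ys)
baseline_mse = mse([base] * len(ys), ys)
boosted_mse = mse([predict_boosted(base, stumps, x) for x in xs], ys)
```

In the landmark setting, the scalar input would be replaced by image features measured around the current landmark estimate, and the target by the displacement towards the annotated location.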
As a possible extension of the approach, we could
post-process the predicted landmarks to enforce shape
consistency (Belhumeur et al., 2011). An attractive
property of our model is that precision can be traded
against speed by traversing only a suitable number of
levels of the cascade.
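The precision/speed trade-off can be sketched as follows. This is a toy simulation under stated assumptions, not the system described in this paper: each cascade level is a hypothetical function mapping a measurement to a displacement, and the measurement is simulated as the signed offset to a known true position, standing in for image features.

```python
from typing import Callable, Optional, Sequence

# Hypothetical types for illustration: a level maps a feature to a displacement.
Level = Callable[[float], float]
Measure = Callable[[float], float]


def run_cascade(levels: Sequence[Level], measure: Measure, start: float,
                max_levels: Optional[int] = None) -> float:
    """Refine `start` with the first `max_levels` levels (all levels if None)."""
    estimate = start
    for level in levels[:max_levels]:
        # Each level measures around the current estimate and adds a correction.
        estimate += level(measure(estimate))
    return estimate


# Simulated setting (an assumption): the "feature" is the offset to the true
# position, and every level recovers half of the remaining offset.
true_position = 10.0
measure = lambda position: true_position - position
levels = [lambda feature: 0.5 * feature for _ in range(6)]

coarse = run_cascade(levels, measure, start=0.0, max_levels=2)  # faster, less precise
fine = run_cascade(levels, measure, start=0.0)                  # slower, more precise
coarse_error = abs(true_position - coarse)
fine_error = abs(true_position - fine)
```

Since the cost grows with the number of levels evaluated, `max_levels` acts as a single knob: stopping after two levels here costs a third of the full cascade and leaves a larger residual error.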
We believe that this generic approach could be applied
to other regression problems in which the features are
derived from measurements of the signal, e.g., to the
detection and localization of more generic objects
using part-based models.
ACKNOWLEDGEMENTS
This work was partially funded by the QUAERO
project supported by OSEO and by the European in-
tegrated project AXES.
REFERENCES
Belhumeur, P. N., Jacobs, D. W., Kriegman, D. J., and Ku-
mar, N. (2011). Localizing parts of faces using a con-
sensus of exemplars. In The 24th IEEE Conference on
Computer Vision and Pattern Recognition (CVPR).
Cao, X., Wei, Y., Wen, F., and Sun, J. (2012). Face alignment
by explicit shape regression - to appear. In Proc.
of CVPR’12.
Cootes, T. F., Edwards, G. J., and Taylor, C. J. (2001). Active
appearance models. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 23(6):681–685.
Cootes, T. F., Taylor, C. J., Cooper, D. H., and Graham, J.
(1995). Active shape models – their training and ap-
plication. Computer Vision and Image Understanding,
61(1):38–59.
Cristinacce, D. and Cootes, T. (2008). Automatic feature
localisation with constrained local models. Pattern
Recognition, 41(10):3054–3067.
Dantone, M., Gall, J., Fanelli, G., and Van Gool, L. (2012).
Real-time facial feature detection using conditional
regression forests. In Computer Vision and Pattern
Recognition (CVPR).
Dollár, P., Welinder, P., and Perona, P. (2010). Cascaded
pose regression. In Computer Vision and Pattern
Recognition (CVPR), pages 1078–1085.
Everingham, M., Sivic, J., and Zisserman, A. (2006). Hello!
My name is... Buffy – Automatic naming of characters
in TV video. In Proceedings of the British Machine
Vision Conference, volume 2.
Friedman, J. H. (2001). Greedy function approximation:
A gradient boosting machine. Annals of Statistics,
29(5):1189–1232.
Jesorsky, O., Kirchberg, K. J., and Frischholz, R. (2001).
Robust Face Detection using the Hausdorff distance.
In AVBPA, pages 90–95.
Kasiński, A., Florek, A., and Schmidt, A. (2008). The PUT
face database. Image Processing and Communica-
tions, 13(3):59–64.
Kass, M., Witkin, A., and Terzopoulos, D. (1988). Snakes:
Active contour models. International Journal of Com-
puter Vision, 1(4):321–331.
Lanitis, A., Taylor, C. J., and Cootes, T. F. (1997). Automatic
interpretation and coding of face images using
flexible models. IEEE Transactions on Pattern Analysis
and Machine Intelligence, 19(7):743–756.
Milborrow, S., Morkel, J., and Nicolls, F. (2010). The
MUCT Landmarked Face Database. Pattern Recog-
nition Association of South Africa.
Uřičář, M., Franc, V., and Hlaváč, V. (2012). Detector of
facial landmarks learned by the structured output SVM.
In Proceedings of the 7th International Conference on
Computer Vision Theory and Applications. VISAPP ’12.
Valstar, M., Martinez, B., Binefa, X., and Pantic, M. (2010).
Facial point detection using boosted regression and
graph models. In Proceedings of IEEE Int’l Conf.
Computer Vision and Pattern Recognition (CVPR’10),
pages 2729–2736, San Francisco, USA.
Viola, P. A. and Jones, M. J. (2001). Rapid object detection
using a boosted cascade of simple features. In Com-
puter Vision and Pattern Recognition (CVPR), pages
511–518.
Vukadinovic, D. and Pantic, M. (2005). Fully automatic facial
feature point detection using Gabor feature based
boosted classifiers. In Proceedings of IEEE Int’l
Conf. Systems, Man and Cybernetics (SMC’05), pages
1692–1698, Waikoloa, Hawaii.