frontal views by their reliance on a frontal face detec-
tor, so are inappropriate for the Helen set (Le et al.,
2012) with its wide range of poses, but our imple-
mentation fares significantly better.
Figure 7 compares search times. Our implemen-
tation is faster than Stasm. The (Cootes et al., 2012)
Random Forest technique (not shown in Figure 7) is
faster than ours, but they fit only 17 points (we fit 76).
Details. The HatAsm and Stasm 3.1 curves in-
clude all faces in the BioID (Jesorsky et al., 2001),
XM2VTS (Messer et al., 1999), and PUT (Kasinski
et al., 2008) sets, and the designated test subset of the Helen
set (Le et al., 2012). Faces not found by the face de-
tectors are included in these curves, with an me17 of
infinity. For the PUT set several extra points needed
for calculating me17s were manually added before
testing began. For the Helen set we approximated the
pupil and nose points needed for me17s from neigh-
boring points, introducing noise to the results (but
equally for both implementations). The Belhumeur et
al. and Cootes et al. curves were transcribed from fig-
ures in their papers. All other curves were created by
running software on a local machine.
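
The way undetected faces enter the curves can be illustrated with a short sketch. It assumes the usual reading of me17, namely the mean point-to-point error over the evaluated landmarks divided by the reference inter-pupil distance. The function names, the argument layout, and the use of `None` for an undetected face are hypothetical and are not taken from the Stasm source.

```python
import math

def me17(fitted, reference, left_pupil, right_pupil):
    """Mean landmark error normalized by the reference inter-pupil distance.

    fitted, reference: lists of (x, y) tuples in the same landmark order.
    left_pupil, right_pupil: indices of the pupils in `reference`.
    fitted is None when the face detector found no face (an assumed
    convention); such a face gets an error of infinity.
    """
    if fitted is None:
        return math.inf
    if len(fitted) != len(reference):
        raise ValueError("fitted and reference must have the same length")
    interpupil = math.dist(reference[left_pupil], reference[right_pupil])
    total = sum(math.dist(p, q) for p, q in zip(fitted, reference))
    return total / (len(reference) * interpupil)

def cumulative_curve(errors, thresholds):
    """Fraction of all faces whose error is at or below each threshold.

    Infinite errors never fall below a threshold, so undetected faces
    lower the plateau of the curve instead of being silently dropped.
    """
    n = len(errors)
    return [sum(e <= t for e in errors) / n for t in thresholds]
```

With this convention a detector that misses a quarter of the faces can never reach more than 0.75 on the vertical axis, which is why such faces are counted here and not excluded.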
Publication Note. Techniques in this paper have
now been integrated into Stasm to form Stasm Ver-
sion 4.0. Documented source code is available at
www.milbo.users.sonic.net/stasm.
8 DISCUSSION AND FUTURE DIRECTIONS
We have shown that HAT descriptors together with
MARS work well with ASMs. HAT descriptors out-
perform gradient based descriptors. HAT descriptors
with MARS bring significant computational advan-
tages over SIFT descriptors with Mahalanobis dis-
tances or SVMs.
An obvious next step would be to investigate other
modern descriptors such as GLOH (Mikolajczyk and
Schmid, 2005), SURF (Bay et al., 2006), or HOG
(Dalal and Triggs, 2005) descriptors (our HAT descriptors
are the same as one variant of HOGs, R-HOGs).
Evidence from other domains indicates that
such alternatives per se may not give fit improvements
over HATs.
In (Milborrow et al., 2013) we extend the model
to non-frontal faces.
In recent years researchers have paid considerable
attention to improving the way the template and shape
models work together. Instead of the rigid separation
between template matching and the shape model of
the classical ASM, one can build a combined model
that jointly optimizes the template matchers and shape
constraints. An early example is the Constrained Lo-
cal Model of (Cristinacce and Cootes, 2006). An
informative taxonomy is given in (Saragih et al.,
2010). Such an approach would probably improve
HAT based landmarkers. The advantages of HATs
are diminished by the classical ASM shape model (the
improvement of HATs over square gradient descriptors
is significantly larger before the shape constraints
are applied). The match response surfaces over the
search regions are smoother for HATs than for square
gradient descriptors, and this might ease the difficult
optimization task.
REFERENCES
Bay, H., Tuytelaars, T., and Gool, L. V. (2006). SURF:
Speeded Up Robust Features. ECCV.
Belhumeur, P. N., Jacobs, D. W., Kriegman, D. J., and Ku-
mar, N. (2011). Localizing Parts of Faces Using a
Consensus of Exemplars. CVPR.
Belongie, S., Malik, J., and Puzicha, J. (2002). Shape
Matching and Object Recognition Using Shape Con-
texts. PAMI.
Breiman, L. (2001). Random Forests. Machine Learning.
Breiman, L., Friedman, J. H., Olshen, R. A., and Stone,
C. J. (1984). Classification and Regression Trees.
Wadsworth.
Castrillón Santana, M., Déniz Suárez, O.,
Hernández Tejera, M., and Guerra Artal, C. (2007).
ENCARA2: Real-time Detection of Multiple Faces at
Different Resolutions in Video Streams. Journal of
Visual Communication and Image Representation.
Çeliktutan, O., Ulukaya, S., and Sankur, B. (2013). A
Comparative Study of Face Landmarking Techniques.
EURASIP Journal on Image and Video Processing.
http://jivp.eurasipjournals.com/content/2013/1/13/abstract.
This study used Stasm Version 3.1.
Cootes, T., Ionita, M., Lindner, C., and Sauer, P. (2012). Ro-
bust and Accurate Shape Model Fitting using Random
Forest Regression Voting. ECCV.
Cootes, T. F. and Taylor, C. J. (1993). Active Shape Model
Search using Local Grey-Level Models: A Quantita-
tive Evaluation. BMVC.
Cootes, T. F. and Taylor, C. J. (2004). Technical Re-
port: Statistical Models of Appearance for Computer
Vision. The University of Manchester School of
Medicine. http://www.isbe.man.ac.uk/~bim/Models/app_models.pdf.
Cootes, T. F., Taylor, C. J., Cooper, D. H., and Graham, J.
(1995). Active Shape Models — their Training and
Application. CVIU.
Cristinacce, D. and Cootes, T. (2006). Feature Detection
and Tracking with Constrained Local Models.
BMVC. mimban.smb.man.ac.uk/publications/index.php.
VISAPP 2014 - International Conference on Computer Vision Theory and Applications