Subtasks of Unconstrained Face Recognition
Joel Z. Leibo, Qianli Liao, Tomaso Poggio
2014
Abstract
Unconstrained face recognition remains a challenging computer vision problem despite recent exceptionally high results (~95% accuracy) on the current gold standard evaluation dataset: Labeled Faces in the Wild (LFW). We offer a decomposition of the unconstrained problem into subtasks based on the idea that invariance to identity-preserving transformations is the crux of recognition. Each of the subtasks in the Subtasks of Unconstrained Face Recognition (SUFR) challenge consists of a same-different face-matching problem on a set of 400 individual synthetic faces rendered so as to isolate a specific transformation or set of transformations. We characterized the performance of nine different models (eight previously published) on each of the subtasks. One notable finding was that the HMAX-C2 feature was not nearly as clutter-resistant as had been suggested by previous publications. Next we considered LFW and argued that it is too easy a task to continue to be regarded as a measure of progress on unconstrained face recognition. In particular, strong performance on LFW requires almost no invariance, yet it cannot be considered a fair approximation of the outcome of a detection → alignment pipeline, since it does not contain the kinds of variability that realistic alignment systems produce when working on non-frontal faces. We offer a new, more difficult, natural image dataset: SUFR-in-the-Wild (SUFR-W), which we created using a protocol similar to that of LFW, but with a few differences designed to produce more need for transformation invariance. We present baseline results for eight different face recognition systems on the new dataset and argue that it is time to retire LFW and move on to more difficult evaluations for unconstrained face recognition.
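As a rough illustration of what a same-different face-matching evaluation involves, the sketch below scores pairs of feature vectors with cosine similarity and calls a pair "same" when the score reaches a threshold. Everything in it is an assumption made for illustration: the abstract does not specify the similarity measure, the threshold, or the pairing scheme, and the names `cosine_similarity` and `same_different_accuracy`, along with the toy data, are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def same_different_accuracy(features, identities, pairs, threshold):
    """Fraction of pairs whose same/different label is predicted correctly.

    A pair is predicted "same" when its similarity is at least `threshold`.
    """
    correct = 0
    for i, j in pairs:
        predicted_same = cosine_similarity(features[i], features[j]) >= threshold
        actually_same = identities[i] == identities[j]
        correct += int(predicted_same == actually_same)
    return correct / len(pairs)


# Toy data (hypothetical): two identities, two images each, 2-D features.
features = [
    [1.0, 0.0],  # identity 0, image A
    [0.9, 0.1],  # identity 0, image B (e.g. a transformed view)
    [0.0, 1.0],  # identity 1, image A
    [0.1, 0.9],  # identity 1, image B
]
identities = [0, 0, 1, 1]
pairs = [(0, 1), (2, 3), (0, 2), (1, 3)]  # two "same" pairs, two "different" pairs

accuracy = same_different_accuracy(features, identities, pairs, threshold=0.5)
print(f"same-different accuracy: {accuracy:.2f}")
```

In a setup like this, a feature that is invariant to the transformation being isolated keeps the two images of one identity close together, so the "same" pairs stay above the threshold as the transformation grows stronger.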
References
- Blender.org (2013). Blender 2.6.
- Braje, W., Kersten, D., Tarr, M., and Troje, N. (1998). Illumination effects in face recognition. Psychobiology, 26(4):371-380.
- Chan, C., Tahir, M., Kittler, J., and Pietikainen, M. (2013). Multiscale local phase quantization for robust component-based face recognition using kernel fusion of multiple descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(5):1164-1177.
- Chen, D., Cao, X., Wen, F., and Sun, J. (2013). Blessing of Dimensionality: High-dimensional Feature and Its Efficient Compression for Face Verification. In IEEE International Conference on Computer Vision and Pattern Recognition (CVPR).
- Dalal, N. and Triggs, B. (2005). Histograms of oriented gradients for human detection. In Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, volume 1, pages 886-893. IEEE.
- DiCarlo, J., Zoccolan, D., and Rust, N. (2012). How does the brain solve visual object recognition? Neuron, 73(3):415-434.
- Felzenszwalb, P. F., Girshick, R. B., McAllester, D., and Ramanan, D. (2010). Object detection with discriminatively trained part-based models. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 32(9):1627-1645.
- Gross, R., Matthews, I., Cohn, J., Kanade, T., and Baker, S. (2010). Multi-pie. Image and Vision Computing, 28(5):807-813.
- Grother, P., Quinn, G., and Phillips, P. (2010). Report on the evaluation of 2d still-image face recognition algorithms. NIST Interagency Report, 7709.
- Guillaumin, M., Verbeek, J., and Schmid, C. (2009). Is that you? Metric learning approaches for face identification. In IEEE International Conference on Computer Vision, pages 498-505, Kyoto, Japan.
- Huang, G. B., Mattar, M., Berg, T., and Learned-Miller, E. (2008). Labeled faces in the wild: A database for studying face recognition in unconstrained environments. In Workshop on faces in real-life images: Detection, alignment and recognition (ECCV), Marseille, Fr.
- Hung, C. P., Kreiman, G., Poggio, T., and DiCarlo, J. J. (2005). Fast Readout of Object Identity from Macaque Inferior Temporal Cortex. Science, 310(5749):863-866.
- Hussain, S., Napoléon, T., and Jurie, F. (2012). Face recognition using local quantized patterns. In Proc. British Machine Vision Conference (BMVC), volume 1, pages 52-61, Guildford, UK.
- Leibo, J. Z., Mutch, J., Rosasco, L., Ullman, S., and Poggio, T. (2010). Learning Generic Invariances in Object Recognition: Translation and Scale. MIT-CSAIL-TR2010-061, CBCL-294.
- Lowe, D. G. (1999). Object recognition from local scale-invariant features. In Computer vision, 1999. The proceedings of the seventh IEEE international conference on, volume 2, pages 1150-1157. IEEE.
- Mutch, J., Knoblich, U., and Poggio, T. (2010). CNS: a GPU-based framework for simulating cortically-organized networks. MIT-CSAIL-TR, 2010-013(286).
- Ojala, T., Pietikainen, M., and Maenpaa, T. (2002). Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 24(7):971-987.
- Ojansivu, V. and Heikkilä, J. (2008). Blur insensitive texture classification using local phase quantization. In Image and Signal Processing, pages 236-243. Springer.
- Phillips, P. J., Flynn, P. J., Scruggs, T., Bowyer, K. W., Chang, J., Hoffman, K., Marques, J., Min, J., and Worek, W. (2005). Overview of the face recognition grand challenge. In Computer vision and pattern recognition, 2005. CVPR 2005. IEEE computer society conference on, volume 1, pages 947-954. IEEE.
- Pinto, N., Barhomi, Y., Cox, D., and DiCarlo, J. J. (2011). Comparing state-of-the-art visual features on invariant object recognition tasks. In Applications of Computer Vision (WACV), 2011 IEEE Workshop on, pages 463-470. IEEE.
- Pinto, N., Cox, D., and DiCarlo, J. J. (2008a). Why is real-world visual object recognition hard? PLoS Computational Biology, 4(1):e27.
- Pinto, N., DiCarlo, J. J., and Cox, D. (2009). How far can you get with a modern face recognition test set using only simple features? In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pages 2591-2598. IEEE.
- Pinto, N., DiCarlo, J. J., Cox, D. D., et al. (2008b). Establishing good benchmarks and baselines for face recognition. In Workshop on Faces in 'Real-Life' Images: Detection, Alignment, and Recognition.
- Poggio, T., Mutch, J., Anselmi, F., Leibo, J. Z., Rosasco, L., and Tacchetti, A. (2012). The computational magic of the ventral stream: sketch of a theory (and why some deep architectures work). MIT-CSAIL-TR-2012-035.
- Riesenhuber, M. and Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2(11):1019-1025.
- Serre, T., Oliva, A., and Poggio, T. (2007). A feedforward architecture accounts for rapid categorization. Proceedings of the National Academy of Sciences of the United States of America, 104(15):6424-6429.
- Singular Inversions (2003). FaceGen Modeller 3.
- Troje, N. and Bülthoff, H. (1996). Face recognition under varying poses: The role of texture and shape. Vision Research, 36(12):1761-1771.
- van de Sande, K. E. A., Gevers, T., and Snoek, C. G. M. (2011). Empowering visual categorization with the GPU. IEEE Transactions on Multimedia, 13(1):60-70.
- Vedaldi, A. and Fulkerson, B. (2008). VLFeat: An open and portable library of computer vision algorithms.
- Viola, P. and Jones, M. J. (2004). Robust real-time face detection. International Journal of Computer Vision, 57(2):137-154.
- Wolf, L., Hassner, T., and Taigman, Y. (2011). Effective unconstrained face recognition by combining multiple descriptors and learned background statistics. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(10):1978-1990.
- Zhu, X. and Ramanan, D. (2012). Face detection, pose estimation, and landmark localization in the wild. In IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), pages 2879-2886, Providence, RI.
Paper Citation
in Harvard Style
Leibo J. Z., Liao Q. and Poggio T. (2014). Subtasks of Unconstrained Face Recognition. In Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014) ISBN 978-989-758-004-8, pages 113-121. DOI: 10.5220/0004694201130121
in Bibtex Style
@conference{visapp14,
author={Joel Z. Leibo and Qianli Liao and Tomaso Poggio},
title={Subtasks of Unconstrained Face Recognition},
booktitle={Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014)},
year={2014},
pages={113-121},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004694201130121},
isbn={978-989-758-004-8},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014)
TI - Subtasks of Unconstrained Face Recognition
SN - 978-989-758-004-8
AU - Leibo J. Z.
AU - Liao Q.
AU - Poggio T.
PY - 2014
SP - 113
EP - 121
DO - 10.5220/0004694201130121