ROBUST FACE ALIGNMENT USING CONVOLUTIONAL NEURAL NETWORKS

Stefan Duffner, Christophe Garcia

2008

Abstract

Face recognition in real-world images mostly relies on three successive steps: face detection, alignment and identification. The second step, face alignment, is crucial because the bounding boxes produced by robust face detection algorithms are still too imprecise for most face recognition techniques, i.e. they show slight variations in position, orientation and scale. We present a novel technique based on a specific neural architecture which, without localizing any facial feature points, precisely aligns face images extracted from the bounding boxes produced by a face detector. The neural network processes face images cropped using misaligned bounding boxes and is trained to simultaneously produce several geometric parameters characterizing the global misalignment. Once trained, the neural network is able to robustly and precisely correct translations of up to ±13% of the bounding box width, in-plane rotations of up to ±30° and variations in scale from 90% to 110%. Experimental results show that 94% of the face images of the BioID database and 80% of the images of a complex test set extracted from the internet are aligned with an error of less than 10% of the face bounding box width.
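The correction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the network itself is omitted, and the function names (`correct_box`, `box_corners`, `alignment_error`), the parameter conventions (translation as a fraction of the detected box width along the image axes, rotation in degrees, scale as a ratio) and the mean-corner-distance error measure are all assumptions chosen for illustration. The abstract states only that the error is measured relative to the face bounding box width.

```python
import math

def correct_box(cx, cy, width, angle_deg, tx, ty, rot_deg, scale):
    """Undo an estimated misalignment of a square face bounding box.

    Assumed convention (hypothetical, not from the paper): the detected box
    is the well-aligned box shifted by (tx, ty) * width, rotated by rot_deg
    and enlarged by the factor scale. The correction applies the inverse.
    """
    new_cx = cx - tx * width
    new_cy = cy - ty * width
    new_width = width / scale
    new_angle = angle_deg - rot_deg
    return new_cx, new_cy, new_width, new_angle

def box_corners(cx, cy, width, angle_deg):
    """Return the four corners of a square box rotated about its centre."""
    h = width / 2.0
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    offsets = ((-h, -h), (h, -h), (h, h), (-h, h))
    return [(cx + c * dx - s * dy, cy + s * dx + c * dy) for dx, dy in offsets]

def alignment_error(box, truth):
    """Mean corner distance, as a fraction of the ground-truth box width.

    This error definition is an assumption made for the sketch.
    """
    pairs = zip(box_corners(*box), box_corners(*truth))
    mean_dist = sum(math.hypot(p[0] - q[0], p[1] - q[1]) for p, q in pairs) / 4.0
    return mean_dist / truth[2]

# Ground-truth box: centre (100, 100), width 80, upright.
truth = (100.0, 100.0, 80.0, 0.0)
# Detector output: shifted, rotated by 15 degrees and 10% too large.
detected = (108.0, 96.0, 88.0, 15.0)
# Parameters a perfect estimator would output under the assumed convention.
estimate = dict(tx=8.0 / 88.0, ty=-4.0 / 88.0, rot_deg=15.0, scale=1.1)

corrected = correct_box(*detected, **estimate)
error_before = alignment_error(detected, truth)
error_after = alignment_error(corrected, truth)
```

In this toy case the estimate is exact, so `error_after` is essentially zero while `error_before` is well above the 10% threshold quoted in the abstract. With a real network the estimates would be approximate and the residual error is what the reported percentages summarize.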

Paper Citation


in Harvard Style

Duffner S. and Garcia C. (2008). ROBUST FACE ALIGNMENT USING CONVOLUTIONAL NEURAL NETWORKS. In Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008) ISBN 978-989-8111-21-0, pages 30-37. DOI: 10.5220/0001073200300037


in Bibtex Style

@conference{visapp08,
author={Stefan Duffner and Christophe Garcia},
title={ROBUST FACE ALIGNMENT USING CONVOLUTIONAL NEURAL NETWORKS},
booktitle={Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008)},
year={2008},
pages={30-37},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001073200300037},
isbn={978-989-8111-21-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the Third International Conference on Computer Vision Theory and Applications - Volume 1: VISAPP, (VISIGRAPP 2008)
TI - ROBUST FACE ALIGNMENT USING CONVOLUTIONAL NEURAL NETWORKS
SN - 978-989-8111-21-0
AU - Duffner S.
AU - Garcia C.
PY - 2008
SP - 30
EP - 37
DO - 10.5220/0001073200300037