Full Video Processing for Mobile Audio-Visual Identity Verification
Alexander Usoltsev, Dijana Petrovska-Delacrétaz, Khemiri Houssemeddine
2016
Abstract
This paper describes a bi-modal biometric verification system based on the voice and face modalities that processes the full video rather than still images. The system is evaluated on the MOBIO corpus, and the results show a relative performance improvement of nearly 10% when the whole video is used. Fusing the face and speaker verification systems with linear logistic regression weights gives a relative performance improvement of between 30% and 60% compared with the best uni-modal system. A proof-of-concept iPad application has been developed based on the proposed bi-modal system.
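The abstract names linear logistic regression as the fusion method but gives no further detail, so the following is only a minimal sketch of how such score-level fusion could look, not the authors' implementation. It assumes each trial yields one face score and one voice score, uses synthetic scores in place of MOBIO data, and fits the weights with plain batch gradient descent. All function names (`synthetic_scores`, `train_fusion`, `fuse`, `relative_improvement`) are hypothetical, and the relative improvement is computed on a generic error rate because the abstract does not state which metric was used.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def synthetic_scores(n_per_class, seed=0):
    """Hypothetical data: genuine trials score around +1, impostors around -1."""
    rng = random.Random(seed)
    face, voice, labels = [], [], []
    for label, mean in ((1, 1.0), (0, -1.0)):
        for _ in range(n_per_class):
            face.append(rng.gauss(mean, 1.0))
            voice.append(rng.gauss(mean, 1.0))
            labels.append(label)
    return face, voice, labels


def train_fusion(face, voice, labels, lr=0.1, epochs=300):
    """Fit (w_face, w_voice, bias) by batch gradient descent on the logistic loss."""
    w_f = w_v = b = 0.0
    n = len(labels)
    for _ in range(epochs):
        g_f = g_v = g_b = 0.0
        for f, v, y in zip(face, voice, labels):
            # Gradient of the logistic loss w.r.t. the linear score is (p - y).
            err = sigmoid(w_f * f + w_v * v + b) - y
            g_f += err * f
            g_v += err * v
            g_b += err
        w_f -= lr * g_f / n
        w_v -= lr * g_v / n
        b -= lr * g_b / n
    return w_f, w_v, b


def fuse(face_score, voice_score, weights):
    """Fused score: weighted sum of the two uni-modal scores plus a bias."""
    w_f, w_v, b = weights
    return w_f * face_score + w_v * voice_score + b


def relative_improvement(baseline_error, new_error):
    """Fraction by which the error drops relative to the baseline."""
    return (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    face, voice, labels = synthetic_scores(200)
    weights = train_fusion(face, voice, labels)
    print("weights (face, voice, bias):", weights)
    print("fused score for one trial:", fuse(1.2, 0.4, weights))
    print("relative improvement:", relative_improvement(10.0, 6.0))
```

The fused score is a log-odds value, so a trial would be accepted when it exceeds a threshold chosen on development data. Under this reading, a "30% relative improvement" means the fused system's error is 70% of the best uni-modal system's error.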
Paper Citation
in Harvard Style
Usoltsev A., Petrovska-Delacrétaz D. and Houssemeddine K. (2016). Full Video Processing for Mobile Audio-Visual Identity Verification. In Proceedings of the 5th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-173-1, pages 552-557. DOI: 10.5220/0005667305520557
in Bibtex Style
@conference{icpram16,
author={Alexander Usoltsev and Dijana Petrovska-Delacrétaz and Khemiri Houssemeddine},
title={Full Video Processing for Mobile Audio-Visual Identity Verification},
booktitle={Proceedings of the 5th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2016},
pages={552-557},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005667305520557},
isbn={978-989-758-173-1},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 5th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Full Video Processing for Mobile Audio-Visual Identity Verification
SN - 978-989-758-173-1
AU - Usoltsev A.
AU - Petrovska-Delacrétaz D.
AU - Houssemeddine K.
PY - 2016
SP - 552
EP - 557
DO - 10.5220/0005667305520557