Voice Verification System for Mobile Devices based on ALIZE/LIA_RAL

Hussein Sharafeddin, Mageda Sharafeddin, Haitham Akkary

Abstract

The main contribution of this paper is an architecture that lets mobile users authenticate their identity through short spoken phrases, using the robust open-source voice recognition library ALIZE and the speaker recognition tool LIA_RAL. Our architecture consists of a server connected to a group of subscribed mobile devices. The server is mainly needed for training the world model, while user training and verification run on the individual mobile devices. The server uses a number of public, text-independent voice files from random speakers to generate the data, including the world model, used in training and in calculating scores. The server data are shipped to all subscribed mobile devices with the initial install of our package and with every subsequent package update. For security purposes, the training data of each user, consisting of raw voice and processed files, reside on that user's device only. Verification is based on short text-independent as well as text-dependent phrases, chosen for ease of use and enhanced performance, which are processed and scored against the user's trained model. While we implemented our voice verification on Android, the system will perform as efficiently on iOS; it will in fact be easier to implement there, since the base libraries are all written in C/C++. We show that the verification success rate of our system is 82%. Our system provides a free, robust alternative to commercial voice identification and verification tools, and it can be extended to implement the more advanced mathematical models that are available in ALIZE and have been shown to improve voice recognition.
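As a rough illustration of what scoring a phrase against the user's trained model and the world model can look like, the sketch below computes a log-likelihood ratio between two Gaussian mixture models. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the model type, so the diagonal-covariance mixtures, the function names, the toy feature vectors, and the threshold of 0.0 are all hypothetical, and none of the code calls ALIZE or LIA_RAL.

```python
import math

# ASSUMPTION: each model is a diagonal-covariance Gaussian mixture given as a
# list of (weight, means, variances) tuples. The abstract does not state this.


def log_gaussian(x, means, variances):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )


def log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of the frames under a mixture."""
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gaussian(x, m, v) for w, m, v in gmm]
        peak = max(terms)  # log-sum-exp trick for numerical stability
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total / len(frames)


def verify(frames, user_model, world_model, threshold=0.0):
    """Return (score, accepted); the threshold value is hypothetical."""
    score = log_likelihood(frames, user_model) - log_likelihood(frames, world_model)
    return score, score >= threshold


# Toy data (hypothetical): the world model would ship from the server, while
# the user model would be trained and kept on the device.
world_model = [(0.5, [0.0, 0.0], [4.0, 4.0]), (0.5, [3.0, 3.0], [4.0, 4.0])]
user_model = [(1.0, [1.0, 1.0], [0.5, 0.5])]

genuine_frames = [[1.0, 1.1], [0.9, 1.0]]    # close to the user model
impostor_frames = [[4.0, 4.0], [-2.0, 3.5]]  # far from the user model

genuine_score, genuine_accepted = verify(genuine_frames, user_model, world_model)
impostor_score, impostor_accepted = verify(impostor_frames, user_model, world_model)
```

In this sketch a positive score means the frames are better explained by the user model than by the world model. A deployed system would tune the threshold on held-out data rather than fix it at zero.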


Paper Citation


in Harvard Style

Sharafeddin H., Sharafeddin M. and Akkary H. (2015). Voice Verification System for Mobile Devices based on ALIZE/LIA_RAL. In Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM, ISBN 978-989-758-077-2, pages 248-255. DOI: 10.5220/0005218902480255


in Bibtex Style

@conference{icpram15,
author={Hussein Sharafeddin and Mageda Sharafeddin and Haitham Akkary},
title={Voice Verification System for Mobile Devices based on ALIZE/LIA_RAL},
booktitle={Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM},
year={2015},
pages={248-255},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005218902480255},
isbn={978-989-758-077-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 2: ICPRAM
TI - Voice Verification System for Mobile Devices based on ALIZE/LIA_RAL
SN - 978-989-758-077-2
AU - Sharafeddin H.
AU - Sharafeddin M.
AU - Akkary H.
PY - 2015
SP - 248
EP - 255
DO - 10.5220/0005218902480255