Authors: Sridhar Poosapadi Arjunan 1; Hans Weghorn 2; Dinesh Kant Kumar 1 and Wai Chee Yau 1
Affiliations: 1 School of Electrical and Computer Engineering, RMIT University, Australia; 2 Information Technology, BA-University of Cooperative Education, Germany
Keyword(s):
HCI, Speech Command, Facial Surface Electromyogram, Artificial Neural Network, Bilingual variation.
Related Ontology Subjects/Areas/Topics: Accessibility to Disabled Users; Computer-Supported Education; Enterprise Information Systems; Human Factors; Human-Computer Interaction; Machine Perception: Vision, Speech, Other; Physiological Computing Systems; Ubiquitous Learning; User Needs
Abstract:
This research examines the use of facial surface electromyogram (fSEMG) to recognise speech commands in English and German without evaluating any voice signals. The system is designed for speech-command applications in Human Computer Interaction (HCI). An effective technique is presented that uses the activity of the articulatory facial muscles and human factors for silent vowel recognition. Speaking speed and style vary between experiments, and this variation appears to be more pronounced when people speak a language other than their native one. This investigation reports measuring the relative activity of the articulatory muscles for recognition of silent vowels of German (native) and English (foreign); three English vowels and three German vowels were used as recognition variables. The moving root mean square (RMS) of the surface electromyogram (SEMG) of four facial muscles is used to segment the signal and to identify the start and end of a silently spoken utterance. The relative muscle activity is computed by integrating and normalising the RMS values of the signals between the detected start and end markers. The resulting feature vector is classified using a back-propagation neural network to identify the voiceless speech. Cross-validation was performed to test the reliability of the classification, and the data were also tested with the K-means clustering technique to determine their linear separability. The experimental results show that this technique yields a high recognition rate for all participants in both languages. The results also show that the system is easy to train for a new user, and they suggest that such a system works reliably for simple vowel-based commands in a human computer interface once it is trained for a user, whether that user speaks one or more languages or has a speech disability.
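As a concrete illustration of the segmentation and feature-extraction steps described in the abstract, the sketch below computes a moving RMS envelope over a four-channel fSEMG recording, detects the start and end of an utterance by thresholding the envelope, and integrates and normalises the per-muscle RMS between the detected markers. The sampling rate, window length, threshold, and all names are illustrative assumptions, not the authors' published parameters.

```python
# Sketch of moving-RMS segmentation and relative-activity features for a
# 4-channel fSEMG recording. All parameters are assumed, not from the paper.
import numpy as np

FS = 1000   # sampling rate in Hz (assumed)
WIN = 50    # moving-RMS window length in samples (assumed)

def moving_rms(x, win=WIN):
    """Moving root mean square of a 1-D signal."""
    power = np.convolve(x ** 2, np.ones(win) / win, mode="same")
    return np.sqrt(power)

def segment_utterance(semg, threshold=0.05):
    """Locate start/end of a silent utterance from the summed moving RMS of
    all channels; `threshold` is a fraction of the peak (assumed)."""
    rms = np.array([moving_rms(ch) for ch in semg])   # (channels, samples)
    envelope = rms.sum(axis=0)
    active = np.flatnonzero(envelope > threshold * envelope.max())
    return active[0], active[-1], rms

def relative_activity(semg):
    """Integrate each channel's RMS between the markers and normalise so the
    four values sum to one: the feature vector fed to the classifier."""
    start, end, rms = segment_utterance(semg)
    integrated = rms[:, start:end].sum(axis=1)
    return integrated / integrated.sum()

# Example with synthetic data: 4 channels, 2 s of "recording" with one burst
rng = np.random.default_rng(0)
semg = 0.01 * rng.standard_normal((4, 2 * FS))
semg[:, 800:1400] += rng.standard_normal((4, 600)) * [[0.5], [0.3], [0.2], [0.1]]
print(relative_activity(semg))   # one normalised value per facial muscle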
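A companion sketch of the classification stage, assuming feature vectors like the one above with integer vowel labels. scikit-learn's MLPClassifier (a feed-forward network trained by backpropagation) stands in for the paper's network; the architecture, the 5-fold cross-validation split, and the synthetic data are illustrative only.

```python
# Sketch: back-propagation ANN classification with cross-validation, plus a
# K-means probe of separability. Data here are synthetic placeholders.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
X = rng.random((60, 4))            # 60 utterances x 4 muscle features
y = rng.integers(0, 6, size=60)    # 6 classes: 3 English + 3 German vowels

clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=1)
scores = cross_val_score(clf, X, y, cv=5)   # cross-validated accuracy
print("mean accuracy:", scores.mean())

# Clustering the same features indicates how linearly separable they are
labels = KMeans(n_clusters=6, n_init=10, random_state=1).fit_predict(X)
```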