performed for each subject. To observe any changes
in muscle activity, the recorded raw EMG signal was
further processed.
After the recording process was completed, the
raw EMG was transferred to Matlab for further
analysis. Using an averaging filter, thresholding was
applied to remove noise. The RMS (root mean
square) value of each signal was estimated with the
window length ‘s’ set to 1.5 s. This window size
was selected as it represented the maximum size of
the envelope for the vowels spoken by the subjects.
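The windowed RMS step can be sketched as follows. This is a minimal
illustration in Python, not the Matlab code used in the study: the sampling
rate `fs`, the function name `window_rms`, and the use of consecutive
non-overlapping windows are assumptions, since the text specifies only the
1.5 s window length.

```python
import math

def window_rms(signal, fs, window_s=1.5):
    """RMS of consecutive, non-overlapping windows of `window_s` seconds.

    `fs` (samples per second) is an assumed parameter; the source text
    gives only the window length of 1.5 s.
    """
    n = int(round(window_s * fs))  # samples per window
    if n <= 0:
        raise ValueError("window must contain at least one sample")
    values = []
    # Step through the signal one full window at a time; a trailing
    # partial window is ignored.
    for start in range(0, len(signal) - n + 1, n):
        chunk = signal[start:start + n]
        values.append(math.sqrt(sum(x * x for x in chunk) / n))
    return values
```

For a recording that spans a single 1.5 s utterance, the function returns one
RMS value per channel, which matches the way the RMS values are later used as
network inputs.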
4 TESTING
Recognition of EMG-based speech features may be
achieved by applying a supervised artificial neural
network (ANN). The artificial neural network is efficient
regardless of data quality. Neural networks can learn
from examples and, once trained, are extremely fast,
making them suitable for real-time applications
(Freeman and Skapura, 1991; Haung, 2001). The
classification by an ANN does not require any
statistical assumptions about the data. ANNs learn to
recognize the characteristic features of the data in order to
classify the data efficiently and accurately.
A back-propagation (BPN) type artificial neural
network was designed and implemented. The
feed-forward (FF) architecture with the BPN
learning algorithm was chosen to overcome the
drawbacks of the standard ANN architecture.
One advantage is that the input is augmented by hidden
context units, which give feedback to the hidden layer
and thus give the network the ability to extract features
of the data from the training events. The
size of the hidden layer and other parameters of the
network were chosen iteratively after
experimentation with the back-propagation
algorithm. There is an inherent trade-off to be made:
more hidden units result in more time required for
each iteration of training; fewer hidden units result
in a faster update rate. For this study, a structure with
two hidden layers was found to give good
performance without being prohibitive in terms of training
time. The sigmoid was used as the threshold
function, and gradient descent with an adaptive learning
rate and momentum as the training algorithm. A learning
rate of 0.02 and the default momentum rate were
found to be suitable for stable learning of the
network. Training stopped when the network
converged and the network error was less than the
target error. The weights and biases of the network
were saved and used for testing the network. The
data was divided into training, validation,
and test subsets. One-fourth of the data was
used for the validation set, one-fourth for the test set,
and one-half for the training set. Three RMS values
of EMG, captured while the subject pronounced the
vowels, were defined as inputs to the ANN. The
output of the ANN was one of the five vowels.
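The data split and the training procedure described above can be sketched as
follows. This is an illustrative Python sketch under stated assumptions, not
the authors' Matlab implementation: the class and function names, the weight
initialisation range, the squared-error loss, and the momentum value of 0.9
are hypothetical (the text says only that the default momentum was used), and
the adaptive learning-rate adjustment is left out for brevity. Only the
learning rate of 0.02, the sigmoid units, the three inputs, the five outputs,
and the one-half / one-fourth / one-fourth split come from the text.

```python
import math
import random

def split_dataset(samples, seed=0):
    """Shuffle, then split into 1/2 training, 1/4 validation, 1/4 test."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train, n_val = len(items) // 2, len(items) // 4
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class FeedForwardNet:
    """Fully connected sigmoid network trained by back-propagation."""

    def __init__(self, sizes, lr=0.02, momentum=0.9, seed=0):
        rng = random.Random(seed)
        self.lr, self.momentum = lr, momentum  # momentum value is assumed
        # One weight matrix per layer; the last entry of each row is the bias.
        self.w = [[[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_out)]
                  for n_in, n_out in zip(sizes[:-1], sizes[1:])]
        # Previous weight changes, needed for the momentum term.
        self.dw = [[[0.0] * len(row) for row in layer] for layer in self.w]

    def forward(self, x):
        """Return the activations of every layer, input included."""
        acts = [list(x)]
        for layer in self.w:
            prev = acts[-1] + [1.0]  # constant input for the bias weight
            acts.append([sigmoid(sum(w * a for w, a in zip(row, prev)))
                         for row in layer])
        return acts

    def train_step(self, x, target):
        """One update on one sample; returns the squared error before it."""
        acts = self.forward(x)
        out = acts[-1]
        # Output-layer delta for squared error with sigmoid units.
        delta = [(o - t) * o * (1.0 - o) for o, t in zip(out, target)]
        for k in range(len(self.w) - 1, -1, -1):
            prev = acts[k] + [1.0]
            # Delta of the layer below, computed before the weights change.
            below = [a * (1.0 - a) * sum(self.w[k][j][i] * delta[j]
                                         for j in range(len(delta)))
                     for i, a in enumerate(acts[k])] if k > 0 else []
            for j, row in enumerate(self.w[k]):
                for i in range(len(row)):
                    # Gradient step plus a fraction of the previous step.
                    step = (-self.lr * delta[j] * prev[i]
                            + self.momentum * self.dw[k][j][i])
                    row[i] += step
                    self.dw[k][j][i] = step
            delta = below
        return sum((o - t) ** 2 for o, t in zip(out, target))

    def predict(self, x):
        """Index of the most active output unit (0..4 for the five vowels)."""
        out = self.forward(x)[-1]
        return out.index(max(out))
```

A network with three inputs, two hidden layers, and five outputs would be
created as `FeedForwardNet([3, 6, 6, 5])`, where the hidden-layer sizes of 6
are placeholders because the text does not report them. Each target would be
a one-hot vector of length five, one position per vowel.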
5 RESULTS AND DISCUSSION
Table 1: Accuracy (%) of vowel recognition from EMG.

            /a/   /e/   /i/   /o/   /u/   Average
Subject 1    97    94    98    93    85    93.4
Subject 2    91    86    90    85    93    89.0
Subject 3    88    89    86    97    95    91.0
Table 1 shows the experimental results. The results
of the testing show that the system described
can classify the five vowels with an average accuracy
of about 91%. This classification accuracy is attributed to
the discriminating ability of the neural network
architecture and to the use of the RMS of EMG as the feature. At
the present stage, the method has been tested
successfully with only three subjects. In order to
evaluate the intra- and inter-subject variability of the method,
a study on a larger experimental population is
required.
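The averages in Table 1 can be checked with a few lines of Python. The
figures are copied from the table; the variable names are illustrative only.

```python
# Recognition accuracy (%) per vowel /a/, /e/, /i/, /o/, /u/, from Table 1.
table1 = {
    "Subject 1": [97, 94, 98, 93, 85],
    "Subject 2": [91, 86, 90, 85, 93],
    "Subject 3": [88, 89, 86, 97, 95],
}

# Mean accuracy per subject (the "Average" column of Table 1).
subject_means = {name: sum(v) / len(v) for name, v in table1.items()}

# Mean of the three subject averages, i.e. the overall accuracy.
overall_mean = sum(subject_means.values()) / len(subject_means)
```

The per-subject means reproduce the Average column (93.4, 89 and 91), and
their mean is approximately 91.1%, consistent with the 91% figure quoted in
the text.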
6 CONCLUSIONS
This paper describes a study on recognising human
speech based on EMG data recorded from
three articulatory facial muscles, classified with
neural networks. Test results show a recognition
accuracy of 91%. The system is accurate when
compared to other attempts at EMG-based speech
recognition. These preliminary results
suggest that the approach is suitable for developing a
real-time EMG-based speech recognition system. This
would have a number of applications, such as voice
control of machines and toys in noisy environments
and for people who do not have the gift of speech. It
would also find other applications, such as noise
reduction for telephone conversations in noisy
environments.
7 FURTHER WORK
The authors are currently working with a larger
population of subjects to determine the inter and