3 CLASSIFICATION ALGORITHMS
In the experiment, three classifiers (Random Forest, k-NN, and SVM) are used to investigate the performance of the extracted input parameters in differentiating the audio sound patterns. Each classifier represents a category of classification algorithms often used in machine learning. The SVM is a non-probabilistic, inherently binary classifier that favours problems with fewer classes; k-NN is an instance-based algorithm that uses similarity measures over the audio features to find the best match for a given new instance; and Random Forest is an ensemble algorithm that combines the predictions of many 'weaker' models to obtain better overall predictions.
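As an illustration of the instance-based approach, a minimal k-NN sketch over audio feature vectors might look as follows. This is only an illustrative sketch in plain Python, not the Weka implementation used in the experiment, and the toy feature values and class labels are hypothetical:

```python
import math
from collections import Counter

def euclidean(a, b):
    # Distance between two audio feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label) pairs.
    Returns the majority label among the k nearest neighbours of `query`."""
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy example: 2-D feature vectors for two symptom classes
train = [([0.10, 0.20], "wheeze"), ([0.15, 0.25], "wheeze"),
         ([0.80, 0.90], "cough"),  ([0.75, 0.85], "cough")]
label = knn_predict(train, [0.12, 0.22], k=3)  # -> "wheeze"
```

The sketch uses a simple majority vote; distance-weighted voting is a common variant when neighbouring classes overlap.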
We compare the discriminatory abilities of the classifiers using both the individual-domain feature sets and the combined-domain feature set. The classification process involves the following steps:
3.1 Feature Selection
The most discriminatory audio features were selected using two attribute-selection algorithms: Correlation-based Feature Selection (CFS) and Principal Component Analysis (PCA). The original feature set consists of 13 attributes, as highlighted in Table 1. The three best features selected by CFS were varRMS, stdZCR and varSB, while the highest-ranking features according to PCA were meanRMS, armRMS, meanSF, stdZCR and varSF. Since stdZCR appears in both lists, this gives a total of 7 attributes in the selected feature set. It is interesting to note that the three features selected by CFS are a good representation of the audio properties considered earlier in the study: varRMS provides information on the energy level of the audio signal, stdZCR reflects its periodicity, and varSB represents the spread or flatness of the audio spectral shape in terms of frequency localization.
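The window-level statistics behind two of the CFS-selected features can be sketched from frame-level measurements. The following is a minimal illustration (helper names such as window_features are our own, not the paper's code) of how varRMS and stdZCR might be computed over a window of audio frames:

```python
import math

def rms(frame):
    # Root-mean-square energy of one audio frame
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def zcr(frame):
    # Zero-crossing rate: fraction of adjacent sample pairs changing sign,
    # a rough indicator of periodicity/frequency content
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return crossings / (len(frame) - 1)

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def window_features(frames):
    # varRMS: spread of energy across frames; stdZCR: spread of periodicity
    rms_vals = [rms(f) for f in frames]
    zcr_vals = [zcr(f) for f in frames]
    return {"varRMS": variance(rms_vals),
            "stdZCR": math.sqrt(variance(zcr_vals))}

# Two toy 4-sample frames (real frames come from the STFT windowing step)
frames = [[0.1, -0.1, 0.2, -0.2], [0.5, 0.4, -0.5, -0.4]]
feats = window_features(frames)
```

A frame with many sign changes (first toy frame) yields a high ZCR, while a frame with larger sample magnitudes yields a higher RMS; the variance and standard deviation then summarize how these vary across the window.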
3.2 Training and Testing
A smartphone-based classification model was built for recognizing and discriminating respiratory signals with related sound features. The experimental processes (STFT, feature extraction and classification) were carried out in the Android Studio 1.5.1 Integrated Development Environment (IDE).
Using embedded Weka APIs, the classifier models were programmatically trained on mobile devices running Android 4.2.2 and 5.1.1; these devices were also used to record some of the audio clips used to evaluate the performance of the algorithms in real time. We opted to train the models directly on the mobile devices rather than porting desktop-trained models, owing to serialization and compatibility issues on Android devices. Moreover, building the model on the smartphone had a faster response time than on the desktop. The machine learning algorithms are trained on the statistical window-level features obtained from the audio signal frames. Because of the limited dataset, a 'leave-one-out' strategy with 10-fold cross-validation was used to train and evaluate the classifiers and the selected features. The statistical metrics used in the performance evaluation were precision, recall and F-measure.
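These metrics follow directly from per-class confusion counts. A minimal sketch (independent of Weka's evaluation utilities; the counts below are hypothetical example numbers, not results from the experiment):

```python
def prf(tp, fp, fn):
    # Precision: fraction of predicted positives that are correct.
    # Recall: fraction of actual positives that are retrieved.
    # F-measure: harmonic mean of precision and recall.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Hypothetical counts for one class: 40 correct detections,
# 10 false alarms, 9 missed instances
p, r, f = prf(tp=40, fp=10, fn=9)
```

In a multi-class setting such as this one, the per-class values are typically averaged (weighted by class support) to give a single figure per classifier.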
4 RESULTS
In this section, we discuss the results and performance of the machine learning algorithms in different scenarios. We also benchmark the real-time performance of the mobile device in terms of CPU and memory usage, as well as the execution/response time of each module in the overall process.
4.1 Performance of the Classifiers
In evaluating the classification process, we presented different scenarios of the problem to the classifiers in order to understand their behaviour. First, we used two categories of datasets: 2.5-second and 5-second recordings of the audio symptoms. The 2.5 s dataset has a total of 163 records (Wheeze = 49, Stridor = 33, Cough = 27, Clear-Throat = 26, Other = 28), while the 5 s dataset used in the classification consists of 99 instances in total. Although there were fewer instances in the 5 s dataset, the algorithms performed better on this category than on the 2.5 s dataset, as shown in Table 2. This implies that longer audio durations, rather than the number of instances, provided the classifiers with more information from which to learn the audio patterns.
Scaling the number of classes used in the classification and adjusting the algorithms' parameters also had a considerable impact on the performance of the classifiers. From Table 2, we observed that the SVM classifier performed much better when we reduced the number of symptom classes to two, and that increasing the complexity parameter C from 1.0 to 3.0 improved the classifier's performance by 4.6%. The k-NN algorithm