which is called heart murmur. Left untreated, RHD
will compromise the cardiac output of the patient
which will subsequently lead to premature death
(Walsh, 2019). The heart sound waveform has
distinct features called the first heart sound (S1), the
second heart sound (S2), systole and diastole parts.
Murmur normally presents itself in the systolic or
diastolic parts. The heart sound can be listened to
and recorded using a stethoscope in the form of a
phonocardiogram (PCG). A three-second MATLAB
plot of clean, noisy and murmur types of heart sound
is shown in Fig. 1. The clean signal is manually
segmented to locate S1, S2, systole and diastole.
Figure 1: Time series representation of heart sounds: clean
heart sound with S1, S2, Systole and Diastole labelled (top),
noisy heart sound (middle), and heart sound with a murmur
(bottom).
The heart sound gives vital information about
cardiac wellbeing. However, even under ideal
conditions, the accuracy of auscultation-based
diagnosis is very low (Pelech, 2004). This is
attributed to the inherent limitation of the human
auditory system in performing accurate auscultation.
On top of that, the listening process is highly
subjective. This usually forces doctors to depend
heavily on other, expensive imaging modalities such as
echocardiography and X-ray for cardiac screening
(Vukanovic-Criley JM, 2006).
To counter the subjectivity and the high percentage
of diagnostic errors, computer-aided diagnostic
(CAD) systems can be of great value (Božo Tomas,
2007; Belloni and Spoletini, 2007).
For the successful implementation of CAD systems,
the quality of the input signal should be high. Such
automation has been researched for over six decades.
In the 1960s, one of the ground-breaking studies in
the automatic classification of heart sound pathology
was reported (D. S. Gerbarg and Hofler, 1963). Since
then, thousands of research papers have been
published. Some of the prominent works are reviewed
in detail in the 2016 report by Liu et al. (Liu C1,
2016). That paper demonstrates the importance of a
well-characterized dataset for developing successful
classification algorithms, and it also assembles the
largest heart sound dataset.
The 2016 PhysioNet Computing in Cardiology
Challenge was one of the most successful challenges
conducted by the program. It attracted a large number
of researchers to the problem of classifying heart
sounds as normal or abnormal. In the competition,
the largest heart sound dataset, compiled by Liu et
al. (Liu C1, 2016), was provided. The winners of the
competition, Potes et al. (Cristhian Potes, 2016),
developed a deep-learning-based classifier that
combines time-frequency features; it has a reported
sensitivity of 96%, specificity of 80% and overall
accuracy of 89%.
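As a rough illustration of how the three figures quoted above relate, the following Python sketch computes sensitivity and specificity from binary labels and averages them. It assumes that abnormal recordings are the positive class and that the overall score is the unweighted mean of sensitivity and specificity, which the text does not state. The function names and the toy labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def mean_accuracy(sens, spec):
    """Assumed overall score: unweighted mean of sensitivity and specificity."""
    return (sens + spec) / 2.0


# Toy labels (1 = abnormal, 0 = normal); not taken from any real dataset.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
overall = mean_accuracy(sens, spec)
```

Averaging the two rates, rather than counting correct predictions over all recordings, keeps the score from being dominated by the larger class when normal and abnormal recordings are unevenly represented.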
Almost all previously proposed algorithms needed
the segmentation of the heart sound recording into
first heart sound, second heart sound, systole, and
diastole parts. This is a reasonable approach, since
it can help pinpoint abnormalities in the heart
sounds at specific temporal locations. However, the
complexity of segmentation and the errors introduced
when localizing the segments have decreased the
performance of the algorithms.
Recently, P. Langley and A. Murray (Cristhian
Potes, 2017) have demonstrated the feasibility of
accurate classification without segmentation of the
heart sounds. The paper reports a relatively lower
overall classification accuracy of 79% (specificity
80%, sensitivity 77%) and attributes this mainly to
the quality of the dataset used. Despite the sheer
volume of research done in the area, the studies are
critically hampered by the lack of high-quality
recordings that have proper validation and
standardization. Such recordings would have provided
a common format that allows collaborative research,
large-scale analytics, and the sharing of tools and
methodologies. The largest open-access dataset
available was compiled by Liu et al. (Liu C1, 2016).
It contains 2435 heart sound recordings from 1297
subjects. The
dataset consists of recordings from subjects with a
variety of abnormalities which include heart valve
damage and coronary artery disease. The maximum
overall accuracy reported in the literature using this
database is only 94%, which was achieved by
introducing different model optimization techniques
(Suhm, 2019).
D.B. Springer et al. (D.B. Springer, 2014) worked
on a dataset recorded to distinguish RHD from normal
heart sounds. It comprises a total of 318 recordings
from 106 subjects, of whom 40 were identified with
RHD. Their aim was to detect systolic murmurs; hence
the heart sound is segmented before feature
extraction. A combination of MFCC and wavelet
features is used. An SVM classifier with optimized
parameters performs the classification, and the
procedure is validated using 10-fold cross-validation.
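The 10-fold cross-validation mentioned above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the procedure of the cited study: a simple nearest-centroid classifier stands in for the SVM so that the sketch needs only the standard library, and randomly generated vectors stand in for the MFCC and wavelet features. All function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_centroids(features, labels):
    """Placeholder classifier (not an SVM): one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Assign x to the class with the closest centroid (squared distance)."""
    return min(
        centroids,
        key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
    )


def cross_validate(features, labels, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, repeat for each fold."""
    scores = []
    for held_out in k_fold_indices(len(features), k, seed):
        held = set(held_out)
        train = [i for i in range(len(features)) if i not in held]
        model = fit_centroids(
            [features[i] for i in train], [labels[i] for i in train]
        )
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Synthetic stand-in data: 318 four-dimensional vectors in two classes.
rng = random.Random(1)
labels = [i % 2 for i in range(318)]
features = [[rng.gauss(3.0 * y, 1.0) for _ in range(4)] for y in labels]
scores = cross_validate(features, labels, k=10)
mean_score = sum(scores) / len(scores)
```

Because the dataset contains several recordings per subject, one might also keep all recordings of a subject in the same fold to avoid optimistic estimates; the text does not say whether this was done.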