Classifying Heart Sounds: Approaches to the PASCAL Challenge

Elsa Ferreira Gomes 1,4, Peter J. Bentley 2, Miguel Coimbra 3, Emanuel Pereira 4 and Yiqi Deng 2

1 GECAD - Knowledge Engineering Decision Support, Institute of Engineering, Polytechnic of Porto, Porto, Portugal
2 Dept. of Computer Science, UCL, London, U.K.
3 Instituto de Telecomunicações, Faculdade de Ciências da Universidade do Porto, Porto, Portugal
4 Institute of Engineering, Polytechnic of Porto, Porto, Portugal
Keywords: Classifying Heart Sounds, Segmentation, Feature Construction.

Abstract: In this paper we describe a methodology for heart sound classification and the results obtained at the PASCAL Classifying Heart Sounds Challenge, together with the results of the competing methodologies. The approach has two steps: segmentation and classification of heart sounds. We also describe the data collection procedure.
1 INTRODUCTION
This paper describes the winning approach for the
PASCAL Classifying Heart Sounds Challenge. The
tasks proposed in this Challenge aim to identify cardiac
pathologies by analyzing features of heartbeats
collected with digital stethoscopes and mobile devices.
The main components of the heart sound signal of
a normal heart are the first heart sound, S1 (or lub),
corresponding to the systolic period, and the second
heart sound, S2 (or dub), corresponding to the diastolic
period (Gupta et al., 2007). This challenge is composed of
Challenge 1 (Heart Sound Segmentation) and Chal-
lenge 2 (Heart Sound Classification). Attempts to segment
phonocardiographic (PCG) signals have been
reported in the literature. The majority of them exploit
electrocardiogram (ECG) signals and/or carotid pulse
data. For example, Groch presented a solution where
the segmentation was based on the time-domain characteristics
of the signal (Groch et al., 1992). Strunic
band-limited the signals to reduce anomalies
and then set an amplitude threshold to pick out the
spikes and perform the segmentation (Strunic et al.,
2007). To achieve classification, Karraz extracted
the QRS complex from the signal as a feature set and
used it in a Neural Network classifier based on
a Bayesian framework (Karraz and Magenes, 2006).
Strunic integrated all the segmented heart cycles into
one average heart cycle and used it to train an Artificial
Neural Network (ANN) to classify heartbeats into
categories. Kampouraki used Support Vector Machines
(SVMs) to classify ECG recordings; however,
real-life data, with varying durations and background
noise, is very challenging (Kampouraki et al., 2009).
To cope with such data, Liang chose a
Chebyshev type I low-pass filter combined with Shannon
energy to attenuate noise and make low-intensity
sounds, namely heart sounds, easier to find
(Liang et al., 1997).
2 DATASETS
Dataset A comprises data crowd-sourced from the
general public via the iStethoscope Pro iPhone app
(Figure 1 - left). iStethoscope Pro is an iOS app
which enables members of the public to use their
iOS smart phone to listen to their hearts (Palm et al.,
2010). The app exploits the excellent audio capabili-
ties of today’s mass market devices, performing real-
time filtering and amplification, and enabling users to
view FFT spectrograms and email 8 seconds of au-
dio. The quality of the audio as assessed by the car-
diologists is as good as or better than commercially
available digital stethoscopes. Dataset B consists of
more than 200 auscultations gathered using the DigiS-
cope Collector system (Figure 1 - right) deployed in
the Maternal and Fetal Cardiology Unit of the Real
Hospital Português (RHP) in Recife, Brazil (Pereira
et al., 2011). Each auscultation consists of 6 to 10
seconds recorded for each of the four standard cardiac
auscultation spots in children, which resulted in a to-
tal of 656 audio files provided for the challenge. All
relevant patient and auscultation information was an-
notated by clinicians from RHP using the DigiScope
Collector system, including the presence of abnormal
sounds such as murmurs. Each individual heartbeat
was manually segmented.
Figure 1: iStethoscope and the DigiScope Collector system.
3 CHALLENGE 1 - HEART SOUND SEGMENTATION
In the first challenge we aim to produce a method for
determining the location of S1 and S2 sounds within
audio data, segmenting the existing Normal audio
files in Dataset A and Dataset B. The recorded signals
were first preprocessed before performing segmentation.
The original signal was decimated, using the
decimate function of Matlab (MATLAB, 2010) with
factor 5. Then, a band-pass filter was applied: considering
the frequency components of S1 and S2 heart
sounds, the chosen filter was a fifth-order Chebyshev
type I filter with a passband from 100 Hz to 882 Hz.
Then, the signals were normalized to the absolute
maximum of the signal (Liang et al., 1997). After
preprocessing, we calculated the Shannon envelope of
the normalized signal: the average Shannon energy
was computed in continuous 0.02 s windows throughout
the signal with 0.01 s overlap (Liang et al., 1997).
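As an illustration, here is a minimal Python/scipy sketch of this preprocessing chain (the paper's own implementation is in Matlab; the 0.5 dB passband ripple and the final envelope normalization are assumed design choices):

```python
import numpy as np
from scipy.signal import cheby1, decimate, filtfilt

def shannon_envelope(audio, fs=44100, win=0.02, step=0.01):
    """Decimate, band-pass filter, normalize, and return the
    average Shannon energy envelope of a heart sound recording."""
    x = decimate(audio, 5)                 # decimation by factor 5
    fs = fs / 5
    # Fifth-order Chebyshev type I filter, 100-882 Hz passband
    # (0.5 dB ripple is an assumption; the paper does not state it).
    b, a = cheby1(5, 0.5, [100 / (fs / 2), 882 / (fs / 2)], btype="bandpass")
    x = filtfilt(b, a, x)
    x = x / np.max(np.abs(x))              # normalize to the absolute maximum
    # Average Shannon energy in 0.02 s windows with a 0.01 s step.
    n, hop = int(win * fs), int(step * fs)
    env = []
    for start in range(0, len(x) - n + 1, hop):
        seg = x[start:start + n]
        env.append(np.mean(-seg**2 * np.log(seg**2 + 1e-12)))  # -x^2 log x^2
    env = np.asarray(env)
    # Normalize the envelope (zero mean, unit variance is one common choice).
    return (env - env.mean()) / env.std()
```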
After obtaining the normalized average Shannon
energy curve, we identified its peaks. For that, we
adapted the open-source function peakdet (Billauer,
2011), written in Matlab. This function finds local
maxima (and minima) using the strategy that a point
is considered a maximum peak if it has a locally maximal
value and was preceded (to the left) by a value
lower than it by at least a given delta.
We used two parameters of the function: the first is
the vector to examine, and the second is the peak gap
threshold (the delta). In our case, considering this
delta on the y values alone was not enough; we also
had to control the distance between peaks on the x-axis,
because two heart sounds cannot occur too close
together. Thus, we changed the function so that a local
maximum is only accepted as a peak if its distance
to the nearest accepted peak also exceeds a second
threshold. Using this, we segmented almost all heart sounds.
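A hypothetical Python re-implementation of this two-threshold peak picking (the paper's version modifies Billauer's Matlab peakdet; parameter names here are illustrative):

```python
import numpy as np

def find_peaks_two_thresholds(env, delta, min_dist):
    """Accept a local maximum as a peak only if (a) it is followed by a
    drop of at least `delta` (the peakdet criterion) and (b) it lies at
    least `min_dist` samples from the previously accepted peak."""
    peaks = []
    cand_val, cand_pos = -np.inf, None
    valley = np.inf
    looking_for_max = True
    for i, v in enumerate(env):
        if looking_for_max:
            if v > cand_val:
                cand_val, cand_pos = v, i      # track the running maximum
            elif cand_val - v > delta:
                # Amplitude criterion met; enforce the x-axis distance too.
                if not peaks or cand_pos - peaks[-1] >= min_dist:
                    peaks.append(cand_pos)
                valley = v
                looking_for_max = False
        else:
            if v < valley:
                valley = v                     # track the running minimum
            elif v - valley > delta:
                cand_val, cand_pos = v, i      # start looking for the next max
                looking_for_max = True
    return np.asarray(peaks)
```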
However, we also need to distinguish between S1 and
S2, assigning the correct label to each peak. Our
current approach for S1/S2 discrimination is still
unsatisfactory. First, we
tried to perform the detection of S1 and S2 sounds
based on the fact that the distance from S2 to S1 is
longer than from S1 to S2, for normal heart rates (Ku-
mar et al., 2006). Bearing this in mind we tried to
pick each heart cycle and the corresponding systolic
interval. The duration of S1 to S2 segments, or the
distance between S1 and S2, was calculated and com-
pared for every segment (Gupta et al., 2007). The
longest interval between two sounds was considered
to correspond to the diastolic period and the sound at
the right side was assigned as S1 and the sound at the
left side was assigned as S2. Unfortunately, we find
that these intervals vary widely from file to file in our
datasets, because both datasets contain very different
kinds of heart sound data.
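A sketch of this labeling heuristic, assuming the detected peak positions of one clip are available (a hypothetical illustration, not the paper's exact code):

```python
import numpy as np

def label_s1_s2(peaks):
    """Label peaks as S1/S2: the longest gap between adjacent peaks is
    taken as diastole, so the peak on its right is S1 and the peak on
    its left is S2; labels then alternate outward from that anchor."""
    if len(peaks) < 2:
        return ["S1"] * len(peaks)          # too few peaks to infer intervals
    gaps = np.diff(peaks)
    d = int(np.argmax(gaps))                # index of the assumed diastolic gap
    labels = [None] * len(peaks)
    labels[d], labels[d + 1] = "S2", "S1"   # ... S2 | diastole | S1 ...
    for i in range(d - 1, -1, -1):          # alternate to the left
        labels[i] = "S1" if labels[i + 1] == "S2" else "S2"
    for i in range(d + 2, len(peaks)):      # alternate to the right
        labels[i] = "S2" if labels[i - 1] == "S1" else "S1"
    return labels
```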
Figure 2 shows the result of our method for peak
detection: the first chart contains the original signal;
the second, the decimated signal; the third, the decimated
signal after Chebyshev filtering; the fourth, the envelope
of the signal with the detected peaks; and the fifth,
the peaks overlaid on the original signal. For this part of the
challenge, the other two teams at the final, the Stan-
ford (Stanford, 2012) and UCL (Deng and Bentley,
2012) teams, used approaches based on wavelet de-
composition and spectrogram analysis. To reject the
extra peaks, the two teams used the intervals between
each adjacent peak (Bentley et al., 2011). The Stanford
team used Shannon energy for peak finding, via
the open-source function peakfinder.m, written in
Matlab (Yoder, 2009). The results
were evaluated on a provided validation set with the
correct locations of S1 and S2 sounds. This set con-
tained the segmentation for sounds of the normal cat-
egory from Dataset A and Dataset B. A test set for
final evaluation was also available. This set contained
hidden locations but provided the final evaluation re-
sults. The total error δ is calculated as $\delta = \sum_{k=1}^{j} \delta_k$, where

$$\delta_k = \frac{\sum_{i=1}^{N_k/2} \big( |RS1_i - TS1_i| + |RS2_i - TS2_i| \big)}{N_k} \qquad (1)$$
In this equation, δ_k is the average distance for the
k-th sound clip in a dataset; N_k is the total number of
S1 and S2 sounds in the k-th sound clip; RS1_i (RS2_i)
indicates the real location of S1 (S2) of the i-th heartbeat;
TS1_i (TS2_i) indicates the calculated location of
S1 (S2) of the i-th heartbeat; and j is the total number of
sound clips in the dataset. For Dataset B, we
obtained a total error of 72242.8, while the other teams
obtained 75569.8 (UCL) and 76444.4 (Stanford). For
Dataset A the error is higher for all teams (Bentley
et al., 2011); there, Stanford obtained the best result,
followed by UCL.
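Read directly, Eq. 1 amounts to the following sketch, assuming each clip supplies aligned arrays of real and detected S1/S2 locations (the alignment step itself is part of the evaluation setup and is not shown):

```python
import numpy as np

def clip_error(real_s1, real_s2, found_s1, found_s2):
    """delta_k of Eq. 1: total absolute location error over all S1 and
    S2 sounds of one clip, divided by their total count N_k."""
    real_s1, found_s1 = np.asarray(real_s1), np.asarray(found_s1)
    real_s2, found_s2 = np.asarray(real_s2), np.asarray(found_s2)
    n_k = len(real_s1) + len(real_s2)
    return (np.abs(real_s1 - found_s1).sum()
            + np.abs(real_s2 - found_s2).sum()) / n_k

def total_error(clips):
    """Total error delta: the sum of delta_k over all j clips, where
    `clips` yields (real_s1, real_s2, found_s1, found_s2) tuples."""
    return sum(clip_error(*c) for c in clips)
```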
Figure 2: Peak detection on heart sound signal.
4 CHALLENGE 2 - HEART SOUND CLASSIFICATION
The task of Challenge 2 is to produce a method that
can classify the real heart audio into one of four
classes for Dataset A (Normal, Murmur, Extra Heart
Sound and Artifact) and three for Dataset B (Normal,
Murmur and Extra systole). This phase involves feature
construction and selection, and its goal is to
correctly label the sounds provided. After the pre-processing
and segmentation
of the heart sound signal, some features were ex-
tracted. Currently, we are using six features, four
of which are extracted from the distances between
S1 and S2 peaks. Assuming that sS1 corresponds to
the smaller segments and sS2 to the others, the first
feature is the ratio of the standard deviation of sS1
over the standard deviation of all segments; the second
is the analogous ratio for sS2. The third and fourth
features are the ratios of the mean of sS1 (respectively
sS2) over the total mean. The fifth feature, Rmedian,
is the ratio of the median of the three largest segments
in the sample over the total mean. The sixth feature,
R2, is the r-square of a linear fit to the sorted segment
lengths of the sample (a measure of linearity); a sketch
of these features is given below. After obtaining the features we used two
different methods from the Weka data mining suite
(Witten and Frank, 2005): J48, which generates deci-
sion trees, and MLP, the Multi Layer Perceptron.
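The following is a hypothetical sketch of these six interval features and of a Weka-equivalent training step; the split into sS1/sS2 (alternating segments, with the smaller-mean group taken as sS1) is one plausible reading of the text, and scikit-learn's DecisionTreeClassifier and MLPClassifier stand in for Weka's J48 and MLP:

```python
import numpy as np

def interval_features(peaks):
    """Six interval-based features for one recording, given detected
    S1/S2 peak positions (assumes several heartbeats are present)."""
    seg = np.diff(peaks).astype(float)        # segment lengths between peaks
    a, b = seg[0::2], seg[1::2]               # alternating segments
    sS1, sS2 = (a, b) if a.mean() <= b.mean() else (b, a)
    f1 = sS1.std() / seg.std()                # std of sS1 over overall std
    f2 = sS2.std() / seg.std()                # std of sS2 over overall std
    f3 = sS1.mean() / seg.mean()              # mean of sS1 over total mean
    f4 = sS2.mean() / seg.mean()              # mean of sS2 over total mean
    f5 = np.median(np.sort(seg)[-3:]) / seg.mean()   # Rmedian
    s = np.sort(seg)                          # R2: linearity of sorted segments
    f6 = np.corrcoef(np.arange(len(s)), s)[0, 1] ** 2
    return np.array([f1, f2, f3, f4, f5, f6])

if __name__ == "__main__":
    # Hypothetical usage with scikit-learn analogues of Weka's J48 and MLP.
    from sklearn.neural_network import MLPClassifier
    from sklearn.tree import DecisionTreeClassifier
    rng = np.random.default_rng(0)
    peak_lists = [np.cumsum(rng.integers(20, 60, size=12)) for _ in range(20)]
    X = np.array([interval_features(p) for p in peak_lists])
    y = rng.choice(["Normal", "Murmur"], size=20)   # fake labels
    for clf in (DecisionTreeClassifier(), MLPClassifier(max_iter=2000)):
        print(type(clf).__name__, clf.fit(X, y).score(X, y))
```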
In Challenge 2, we assess our classification approach
using three metrics (per dataset) calculated
from the tp (true positives), fp (false positives), tn
(true negatives) and fn (false negatives) values. The
metrics are precision per class, Youden's Index,
the F-score (only for Dataset A) and the Discriminant
Power (only for Dataset B). Precision gives us the
positive predictive value: the proportion of samples
placed in category c that actually belong to category c.
Youden's Index has been used to compare
the diagnostic abilities of two tests, by evaluating
the algorithm's ability to avoid failure. In Dataset
A, Youden's Index is evaluated for the Artifact category;
in Dataset B, it is calculated for
the problematic heartbeats. In Table 1 and Table
2, we present the results for Dataset A and Dataset
B, obtained by our approach after applying the J48
and MLP methods. We also present the results ob-
tained by the UCL team. They focused on the number
of heartbeats and on features of the systolic and
diastolic periods, namely the length of the peak sequence
before extra-peak rejection, the length of the finally
selected peak sequence after rejection, the means of
the systolic and diastolic periods, and the standard
deviations of the systolic and diastolic periods. As
in Challenge 1, a test set was provided on which we
could assess our method's effectiveness on unlabelled data.
Table 1: Challenge 2 evaluation for Dataset A.
J48 MLP UCL
Precision of Normal 0.25 0.35 0.46
Precision of Murmur 0.47 0.67 0.31
Precision of Extra sound 0.27 0.18 0.11
Precision of Artifact 0.71 0.92 0.58
Artifact Sensitivity 0.63 0.69 0.44
Artifact Specificity 0.39 0.44 0.44
Youden Index of Artifact 0.01 0.13 -0.09
F-Score 0.20 0.20 0.14
Total Precision 1.71 2.12 1.47
Table 2: Challenge 2 evaluation for Dataset B.
J48 MLP UCL
Precision of Normal 0.72 0.70 0.77
Precision of Murmur 0.32 0.30 0.37
Precision of Extrasystole 0.33 0.67 0.17
Heart problem Sensitivity 0.22 0.19 0.51
Heart problem Specificity 0.82 0.84 0.59
Youden Index of Heart problem 0.04 0.02 0.1
Discriminant Power 0.05 0.04 0.09
Total Precision 1.37 1.67 1.31
As we can see in Table 2, our method has prob-
ClassifyingHeartSounds-ApproachestothePASCALChallenge
339
lems in classifying the non-normal heartbeats, for
Dataset B. In Dataset A, the normal class is one of the
most difficult (Table 1). However, we think we can
improve our method by improving the correct identifi-
cation of S1 and S2 in the segmentation and by finding
new features that take advantage of this identification.
5 CONCLUSIONS
In this paper, we present the methodology that won
the Classifying Heart Sounds PASCAL Challenge.
We proposed an algorithm for S1 and S2 heart sound
identification (without ECG reference). The segmen-
tation is accomplished by using the envelope of Shan-
non energy and an algorithm for peak detection. Despite
the good performance in detecting S1 and S2
sounds in the signal, we need to improve the criteria
for deciding which sound is S1 and which is S2.
After the segmentation, we used J48 and
MLP algorithms (using Weka) to train and classify the
computed features into Normal, Murmur or Extra sys-
tole for Dataset B and Normal, Murmur, Extra sound
and Artifact for Dataset A. We also compared our
results with those obtained by the other two teams
present at the final of the competition. Stanford obtained
the best results (for Dataset A) on Challenge 1 but did
not provide an answer for Challenge 2. Our method, as
well as the method followed by the UCL team, worked
better for Dataset B (the dataset with less noise) than for
Dataset A. We think these approaches and this com-
parative study provide a good basis for further anal-
ysis of the heart sound signals. In Challenge 2, our
approach with MLP had the highest total precision.
Nevertheless, the UCL team performed better in some
of the partial success measures.
ACKNOWLEDGEMENTS
We would like to acknowledge the financial sup-
port of Fundação para a Ciência e Tecnologia for
the DigiScope project with reference PTDC/EIA-
CCO/100844/2008. We also thank the PASCAL Net-
work of Excellence for supporting the Classifying
Heart Sounds Challenge and Workshop.
REFERENCES
Bentley, P., Nordehn, G., Coimbra, M., and Man-
nor, S. (2011). The PASCAL Classifying Heart
Sounds Challenge 2011 (CHSC2011) Results.
www.peterjbentley.com/heartchallenge.
Billauer, E. (2011). peakdet. www.billauer.co.il/peakdet.html.
Deng, Y. and Bentley, P. J. (2012). A robust heart sound
segmentation and classification algorithm using
wavelet decomposition and spectrogram.
http://www.peterjbentley.com/heartworkshop/challengepaper3.pdf.
Groch, M. W., Domnanovich, J. R., and Erwin, W. D.
(1992). A new heart-sounds gating device for medical
imaging. IEEE Transactions on Biomedical Engineer-
ing, 39:307–310.
Gupta, C. N., Palaniappan, R., Swaminathan, S., and Kr-
ishnan, S. M. (2007). Neural network classification of
homomorphic segmented heart sounds. Applied Soft
Computing, 7:286–297.
Kampouraki, A., Manis, G., and Nikou, C. (2009). Heart-
beat Time Series Classification With Support Vector
Machines. IEEE Transactions on Information Tech-
nology in Biomedicine, 13:512–518.
Karraz, G. and Magenes, G. (2006). Automatic Classi-
fication of Heartbeats using Neural Network Classi-
fier based on a Bayesian Framework. In Annual In-
ternational Conference of the IEEE Engineering in
Medicine and Biology Society, pages 4016–4019.
Kumar, D., Carvalho, R., Antunes, M., Gil, R., Henriques,
J., and Eugenio, L. (2006). A New Algorithm for De-
tection of S1 and S2 Heart Sounds. In International
Conference on Acoustics, Speech, and Signal Process-
ing, volume 2.
Liang, H., Lukkarinen, S., and Hartimo, I. (1997). Heart
sound segmentation algorithm based on heart sound
envelogram. In Computers in Cardiology 1997, pages
105–108.
MATLAB (2010). version 7.10.0 (R2010a). The Math-
Works Inc., Natick, Massachusetts.
Palm, D., Burns, S., Pasupathy, T., Deip, E., Blair, B.,
Flynn, M., Drewek, A., Sjostrand, M., Stephenson, B.,
and Nordehn, G. (2010). Artificial Neural Network
Analysis of Heart Sounds Captured From an Acous-
tic Stethoscope and Emailed Using iStethoscopePro.
Journal of Medical Devices, 4(2):027531+.
Pereira, D., Hedayioglu, F., Correia, R., Silva, T., Dutra,
I., Almeida, F., Mattos, S., and Coimbra, M. (2011).
Digiscope - unobtrusive collection and annotating of
auscultations in real hospital environments. In En-
gineering in Medicine and Biology Society, EMBC,
2011 Annual International Conference of the IEEE,
pages 1193 –1196.
Stanford, S. (2012). http://www.peterjbentley.com/heartworkshop/challengepaper2.pdf.
Strunic, S. L., Rios-gutirrez, F., Alba-flores, R., Nordehn,
G., and Burns, S. (2007). Detection and Classifica-
tion of Cardiac Murmurs Using Segmentation Tech-
niques and Artificial Neural Networks. In IEEE Sym-
posium on Computational Intelligence and Data Min-
ing, pages 128–133.
Witten, I. H. and Frank, E. (2005). Data Mining: Practi-
cal Machine Learning Tools and Techniques. Morgan
Kaufmann.
Yoder, N. C. (2009). peakfinder.m. www.mathworks.com/matlabcentral.
HEALTHINF2013-InternationalConferenceonHealthInformatics
340