A RULE-BASED CLASSIFICATION OF LARYNGOPATHIES

BASED ON SPECTRUM DISTURBANCE ANALYSIS

An Exemplary Study

Krzysztof Pancerz

, Wiesław Paja

, Jarosław Szkoła

, Jan Warchoł

and Gra˙zyna Olchowik

Institute of Biomedical Informatics, University of Information Technology and Management in Rzesz´ow, Rzesz´ow, Poland

Department of Biophysics, Medical University of Lublin, Lublin, Poland

Keywords:

Laryngopathy, Classiﬁcation, Spectrum analysis.

Abstract:

Our research concerns data derived from the examined patient’s speech signals for a non-invasive diagnosis

of selected larynx diseases. The paper is devoted to the rule-based classiﬁcation of patients on the basis of

a family of coefﬁcients reﬂecting spectrum disturbances around basic tones and their multiples. The paper

presents a proposed procedure for feature selection and classiﬁcation as well as the experiments carried out on

real-life data.

1 INTRODUCTION

Our research concerns designing methods for classi-

ﬁcation of patients with selected larynx diseases us-

ing a non-invasive diagnosis. Two diseases are taken

into consideration: Reinke’s edema (RE) and laryn-

geal polyp (LP). In general, the classiﬁcation is based

on selected parameters of a patient’s speech signal

(phonation). Gathering parameters in this way is con-

venient for the patient because a measurement instru-

ment (microphone) is located outside the voice or-

gan. This enables free articulation. Moreover, dif-

ferent physiological and psychological patient factors

impede making a diagnosis using direct methods.

There exist various approaches to analysis of bio-

medical signals (cf. (Semmlow, 2009)). In general,

we can distinguish three groups of methods accord-

ing to a domain of the signal analysis: analysis in a

time domain, analysis in a frequency domain (spec-

trum analysis), analysis in a time-frequency domain

(e.g., wavelet analysis). Therefore, in our research,

we are going to build a specialized computer tool for

supporting diagnosis of laryngopathies based on a hy-

brid approach. A decision support system will have

a hierarchical structure based on multiple classiﬁers

working on signals in time and frequency domains.

A series of papers published earlier has presented

approaches designed for the computer tool being de-

veloped. Approaches presented in (Warchoł et al.,

2010) and (Pancerz et al., 2011) have been based on

the speech spectrum analysis. However, in the ap-

proach presented in (Szkoła et al., 2010b), (Szkoła

et al., 2010a), (Szkoła et al., 2011a), and (Szkoła

et al., 2011b), we have detected all non-natural dis-

turbances in articulation of selected phonemes using

the modiﬁed Elman-Jordan networks that are recur-

rent neural networks. They can be used for pattern

recognition in time series data due to their ability to

memorize some information from the past. The ap-

proach initialized in (Pancerz et al., 2011) is extended

in this paper.

2 CLASSIFICATION PROCESS

In the proposed approach, selected features (parame-

ters) reﬂecting patient’s speech spectrum disturbances

around a basic tone and its multiples (harmonics) are

calculated. Clinical experience shows that harmon-

ics in the speech spectrum of a healthy patient are

distributed approximately steadily. However, larynx

diseases may disturb this distribution (cf. (Warchoł,

2006)). Therefore, the analysis of a degree of distur-

bances can support diagnosis of larynx diseases.

Disturbances are expressed by a family of coefﬁ-

cients computed for neighborhoods of a basic tone f

and its four multiples (f

, f

). In a real situ-

ation frequencies f

, f

, etc. are not distributed

steadily (cf. (Warchoł, 2006)). It means, that we need

to ﬁnd a real distribution of harmonics. In the pre-

sented approach, it is done on the basis of the resul-

458

Pancerz K., Paja W., Szkoła J., Warchoł J. and Olchowik G..

A RULE-BASED CLASSIFICATION OF LARYNGOPATHIES BASED ON SPECTRUM DISTURBANCE ANALYSIS - An Exemplary Study.

DOI: 10.5220/0003874304580461

In Proceedings of the International Conference on Bio-inspired Systems and Signal Processing (BIOSIGNALS-2012), pages 458-461

ISBN: 978-989-8425-89-8

 2012 SCITEPRESS (Science and Technology Publications, Lda.)

tant spectrum (a sum of spectrums calculated for se-

lected N-point time windows into which an original

speech signal is divided). For each original frequency

f, a maximum magnitude is searched in the interval

[ f − d

, f + d

]. This maximum value is assumed as

a real harmonic. The original basic tone f

has been

obtained for each patient from histogram created in

the Multi-Dimensional Voice Program (MDVP). It is

a software tool for quantitativeacoustic assessment of

voice quality, calculating various parameters on a sin-

gle vocalization (see (MDVP, 2011)). On the basis of

, its harmonics(for ideal case) havebeen calculated.

Each coefﬁcient expresses the distribution of a

spectrum around a given frequency f. We can dis-

tinguish two types of coefﬁcients: the regularity coef-

ﬁcient R determining a degree of slenderness of this

distribution and the deviation coefﬁcient D determin-

ing a relative difference between a real multiple de-

rived from the spectrum and a multiple calculated on

the basis of the basic tone f

Let a speech signal of a given patient (a vector of

samples S = [s

, s

, . . . , s

]) be given. Our approach

can be expressed as a sequence of the following steps:

1. Normalizing signal sample values to the interval

[−1.0, 1.0], i.e., for each sample s ∈ S, a normal-

ized sample s

norm

is equal to s

norm

max

i=1

2. Dividing the speech signal S into a family W

of N-point disjoint time windows, i.e., W =

, ...,W

], where W

= S[1, . . . , N], W

S[N + 1, . . . , 2N], etc.

3. Calculating harmonics (ﬁrst f

, second f

, third

, and fourth f

) of a patient’s basic tone f

, i.e.,

= 2f

, f

= 3f

, f

= 4f

, and f

= 5f

4. Calculating a resultant discrete spectrum Sp for

the family W of time windows excluding some

initial and ﬁnal ones, based on the Discrete-Time

Fourier Transform (DTFT), see e.g. (Semm-

low, 2009), i.e., X[k] =

∑

n=1

W[n]e

−2πjkn

, where

k = 0, 1, . . . , N − 1. The spectrum Sp(W) is

a vector of magnitudes |X[k]|, i.e., Sp(W) =



|X[0]|, |X[1]|, . . ., |X[



. The resultant discrete

spectrum Sp is calculated as Sp =

∑

W∈W

Sp(W).

5. Finding real multiples (f

, f

, and f

) of

the basic tone f

. A real multiple f

is as-

sumed to correspond to the maximum value of

Sp in a neighborhood of f

, i.e., in the interval

[ f

− d

, f

+ d

], where d

is the input parame-

ter and k = 1, . . . , 4.

6. Calculating the regularity coefﬁcient R

, which

can also be treated as some kind of shape pa-

rameter, for each frequency f

(the basic tone

or its multiple f

, f

, or f

). Two dis-

crete integrals I

and I

of the spectrum are cal-

culated, the ﬁrst one for the frequency interval

[ f − d

, f + d

], the second one for the frequency

interval [ f − d

, f + d

], where d

and d

are input

parameters and d

< d

. The discrete integral I of

a fragment (between points k

and k

) of the spec-

trum Sp is equal to I =

∑

j=k

|X[ j]|. The ratio of I

to I

constitutes the regularity coefﬁcient R

, i.e.,

. If both integrals are equal (ideal case),

then the regularity coefﬁcient R

= 1. In a real

situation, some fuzziness of a spectrum around

a given harmonic frequency can be observed. It

causes that the regularity coefﬁcient R

< 1, be-

cause I

< I

. It is easy to see, that the greater the

fuzziness of the spectrum, the smaller the regular-

ity coefﬁcient (the smaller the slenderness of the

spectrum distribution).

7. Calculating the deviation coefﬁcient D

for each

multiple f

of the basic tone f

. The deviation

coefﬁcient D

is the ratio D

| f

− f

, where |x|

denotes the absolute value of x.

After execution of the procedure consisting of

steps presented above, we obtain a decision table

which is an input for classiﬁcation algorithms. A sam-

ple of the decision table is presented in Table 1.

In the decision table being a training set of cases

for classiﬁers, we assign to each patient one of the

two classes: norm - a norm - for the patient from the

control group, i.e., without disturbances of phonation

conﬁrmed by a phoniatrist opinion, path - pathology

- for the patient either with laryngeal polyp, or with

Reinke’s edema (both clinically conﬁrmed).

The approach presented in this paper has been

tested using classiﬁcation algorithms available in the

popular data mining and machine learning software

tools: WEKA (Witten and Frank, 2005), Rough

Set Exploration System (RSES) (Bazan and Szczuka,

2005), NGTS (Hippe, 1997). Four of them are rule-

based algorithms: exhaustive (RSES) (Bazan et al.,

2000), LEM2 (RSES) (Grzymala-Busse, 1997), Ge-

netic (RSES) (Wr´oblewski, 1998), and NGTS (Hippe,

1997). Two of them are decision-tree based algo-

rithms: J48 (WEKA) - an implementation of C4.5

(Quinlan, 1993), CART (WEKA) (Breiman et al.,

1993).

Values of features can be treated as continuous

quantitative data. Building classiﬁcation rules for

such data can be difﬁcult and/or highly inefﬁcient.

Therefore, for RSES generation algorithms, the so-

called discretization was a necessary preprocessing

A RULE-BASED CLASSIFICATION OF LARYNGOPATHIES BASED ON SPECTRUM DISTURBANCE ANALYSIS

- An Exemplary Study

459

Table 1: A sample of the decision table.

Patient ID R

CLASS

#1 0.90 0.89 0.89 0.82 0.75 0.01 0.00 0.00 0.00 norm

#2 0.85 0.82 0.75 0.59 0.58 0.04 0.03 0.08 0.08 path

... ... ... ... ... ... ... ... ... ... ...

step (Cios et al., 2007). Its overall goal was to re-

duce the number of values by grouping them into a

number of intervals. Some discretization techniques

based on rough sets and Boolean reasoning have been

presented in (Bazan et al., 2000).

The sets of rules obtained using NGTS were im-

proved using the RuleSEEKER system (Błajdo et al.,

2004). The main optimizing process was based on

an exhaustive application of a collection of generic

operations (Paja and Hippe, 2005): ﬁnding and re-

moving redundancy, ﬁnding and removing incorpora-

tive rules, merging rules, ﬁnding and removing un-

necessary rules, ﬁnding and removing unnecessary

conditions, creating missing rules, discovering hidden

rules, rule speciﬁcation, selecting ﬁnal set of rules.

3 EXPERIMENTS

In the experiments, sound samples were analyzed.

The experiments were carried out on two groups

(Warchoł, 2006). The ﬁrst group included persons

without disturbances of phonation - the control group

(CG). They were conﬁrmed by a phoniatrist opin-

ion. All persons were non-smoking, so they did not

have contact with toxic substances which can have

an inﬂuence on the physiological state of vocal folds.

The second group included patients of Otolaryngol-

ogy Clinic of the Medical University of Lublin in

Poland. They had clinically conﬁrmed dysphonia as

a result of Reinke’s edema (RE) or laryngeal polyp

(LP). Experiments were carried out by a course of

breathing exercises with instruction about the way of

articulation. The task of all examined patients was to

utter separately different Polish vowels with extended

articulation as long as possible, without intonation,

and each on separate expiration.Each sound sample

was recorded on MiniDisc MZ-R55 (Sony). Effec-

tiveness of such an analysis was conﬁrmed by Win-

holtz and Titze in 1998 (Winholtz and Titze, 1998).

For experiments, we have tested various combina-

tions of input parameters. The best classiﬁcation re-

sults have been obtained for the following parameters:

N = 8192 - the number of points (samples) taken for

DTFT, d

= 12 - deviation for searching maximum,

= 4, d

= 8 - deviations for calculating spectrum

regularity coefﬁcients.

To determine accuracy of generated rules by clas-

siﬁcation algorithms a cross-validation method was

used. Cross-validation is frequently used as a method

for evaluating classiﬁcation models. It comprises of

several training and testing runs. First, the data set is

split into several, possibly equal in size, disjoint parts.

Then, one of the parts is taken as a training set for rule

generation and the remainder (sum of all other parts)

becomes the test set for rule validation. In our exper-

iments, a standard 10 cross-validation test was used

(CV-10). Each classiﬁcation algorithm was evaluated

via testing the classiﬁcation accuracy of unseen cases.

Results of evaluation are collected in Table 2.

Table 2: Results of experiments: classiﬁcation accuracy.

Algorithm Classiﬁcation accurracy

Exhaustive (RSES) 0.8430

LEM2 (RSES) 0.8700

Genetic (RSES) 0.8430

J48 (WEKA) 0.8441

CART (WEKA) 0.7922

NGTS 0.6786

NGTS (partial matching) 0.7786

NGTS (after optimization) 0.7214

NGTS (after optimization, 0.8071

partial matching)

Almost all popular rule-based algorithms of ma-

chine learning and data mining manifest very simi-

lar classiﬁcation accuracy. The obtained result can be

treated as promising. In general, the considered prob-

lem is not simple. An important role is played by the

quality of speech recording. The quality of results

is also dependent on chosen preprocessing methods

(e.g. ﬁltration) and signal processing methods, e.g.

Discrete-Time Fourier Transform (DTFT). An espe-

cially difﬁcult task is the extraction of the correct sig-

nal.

4 CONCLUSIONS

On the basis of experiments described in the paper,we

can notice that a family of coefﬁcients, calculated on

the basis of the analysis of spectrum shapes and de-

viations around a basic tone and its multiples for the

BIOSIGNALS 2012 - International Conference on Bio-inspired Systems and Signal Processing

460

examined patient’s speech signal, may be helpful for

classiﬁcation of a patient. In general, the speech anal-

ysis of patients with larynx diseases is a difﬁcult task.

Therefore, the obtained results seem to be promising.

In the future, we plan to tune parameters reﬂect-

ing regularity and deviations in the speech spectrum,

especially by selecting proper regions of interest of

the spectrum and we are going to add some new coef-

ﬁcients characterizing spectrum disturbances as well.

The proposed approach based on analysis of speech

signals in a frequency domain can be a part of a com-

puter tool based on multicriteria decision making pro-

cess. It should be treated as a supplementary element

for other techniques. An important challenge is to

design methods enabling distinction between differ-

ent larynx diseases (for example, laryngeal polyp and

Reinke’s edema). The approach presented in this pa-

per does not enable us to make this distinction.

ACKNOWLEDGEMENTS

This research has been supported by the grant No. N

N516 423938 from the Polish Ministry of Science and

Higher Education.

REFERENCES

Bazan, J. G., Nguyen, H. S., Nguyen, S. H., Synak, P., and

Wroblewski, J. (2000). Rough set algorithms in classi-

ﬁcation problem. In Polkowski, L., Tsumoto, S., and

Lin, T. Y., editors, Rough Set Methods and Applica-

tions, pages 49–88. Physica-Verlag, Heidelberg, Ger-

many.

Bazan, J. G. and Szczuka, M. S. (2005). The Rough Set

Exploration System. In Peters, J. and Skowron, A.,

editors, Transactions on Rough Sets III, pages 37–56.

Springer-Verlag, Berlin Heidelberg.

Błajdo, P., Grzymała-Busse, J., Hippe, Z., Knap, M.,

Marek, T., Mroczek, T., and Wrzesie´n, M. (2004). A

suite of machine learning programs for data mining:

chemical applications. In Debska, B. and Fic, G., edi-

tors, Information Systems In Chemistry 2, pages 7–14.

University of Technology Editorial Ofﬁce, Rzeszow.

Breiman, L., Friedman, J., Olshen, R., and Stone, C. (1993).

Classiﬁcation and Regression Trees. Chapman &

Hall, Boca Raton.

Cios, K., Pedrycz, W., Swiniarski, R., and Kurgan, L.

(2007). Data mining. A knowledge dis-covery ap-

proach. Springer, New York.

Grzymala-Busse, J. (1997). A new version of the rule induc-

tion system LERS. Fundamenta Informaticae, 31:27–

39.

Hippe, Z. (1997). Machine learning a promising strategy

for business information processing? In Abramowicz,

W., editor, Business Information Systems, pages 603–

622. Academy of Economics Editorial Ofﬁce, Poznan.

MDVP (2011). Multi-dimensional voice program (MDVP).

http://www.kayelemetrics.com.

Paja, W. and Hippe, Z. (2005). Feasibility studies of

quality of knowledge mined from multiple secondary

sources. I. Implementation of generic operations. In

Klopotek, M., Wierzchon, S., and Trojanowski, K.,

editors, Intelligent Information Processing and Web

Mining, pages 461–465. Springer-Verlag, Berlin Hei-

delberg.

Pancerz, K., Szkoła, J., Warchoł, J., and Olchowik, G.

(2011). Spectrum disturbance analysis for computer-

aided diagnosis of laryngopathies: An exemplary

study. In Proc. of the International Workshop on

Biomedical Informatics and Biometric Technologies

(BT’2011), Zilina, Slovak Republic.

Quinlan, J. (1993). C4.5. Programs for machine learning.

Morgan Kaufmann Publishers.

Semmlow, J. (2009). Biosignal and Medical Image Pro-

cessing. CRC Press.

Szkoła, J., Pancerz, K., and Warchoł, J. (2010a). Computer-

based clinical decision support for laryngopathies us-

ing recurrent neural networks. In Hassanien, A.

et al., editors, Proc. of the ISDA’2010, pages 627–632,

Cairo, Egypt.

Szkoła, J., Pancerz, K., and Warchoł, J. (2010b). Com-

puter diagnosis of laryngopathies based on temporal

pattern recognition in speech signal. Bio-Algorithms

and Med-Systems, 6(12):75–80.

Szkoła, J., Pancerz, K., and Warchoł, J. (2011a). Improv-

ing learning ability of recurrent neural networks: Ex-

periments on speech signals of patients with laryn-

gopathies. In Babiloni, F. et al., editors, Proc. of the

BIOSIGNALS’2011, pages 360–364, Rome, Italy.

Szkoła, J., Pancerz, K., and Warchoł, J. (2011b). Recurrent

neural networks in computer-based clinical decision

support for laryngopathies: An experimental study.

Computational Intelligence and Neuroscience, 2011.

Article ID 289398.

Warchoł, J. (2006). Speech Examination with Correct and

Pathological Phonation Using the SVAN 912AE Anal-

yser (in Polish). PhD thesis, Medical University of

Lublin.

Warchoł, J., Szkoła, J., and Pancerz, K. (2010). Towards

computer diagnosis of laryngopathies based on speech

spectrum analysis: A preliminary approach. In Fred,

A., Filipe, J., and Gamboa, H., editors, Proc. of the

BIOSIGNALS’2010, pages 464–467, Valencia, Spain.

Winholtz, W. and Titze, I. (1998). Suitability of minidisc

(MD) recordings for voice perturbation analysis. Jour-

nal of Voice, 12(2):138–142.

Witten, I. H. and Frank, E. (2005). Data Mining: Practi-

cal Machine Learning Tools and Techniques. Morgan

Kaufmann.

Wr´oblewski, J. (1998). Genetic algorithms in decomposi-

tion and classiﬁcation problem. In Polkowski, L. and

Skowron, A., editors, Rough Sets in Knowledge Dis-

covery 2, volume 2, pages 471–487. Physica-Verlag,

Heidelberg, Germany.

A RULE-BASED CLASSIFICATION OF LARYNGOPATHIES BASED ON SPECTRUM DISTURBANCE ANALYSIS

- An Exemplary Study

461