Multistage Naive Bayes Classifier with Reject Option for Multiresolution Signal Representation

Urszula Libal

Institute of Computer Engineering, Control and Robotics, Wroclaw University of Technology, Wroclaw, Poland

Keywords: Multistage Classifier, Naive Bayes, Reject Option, Pattern Recognition, Wavelet, Resolution.

Abstract: Two approaches to pattern recognition of signals are compared: a direct one and a multistage one. Two generic signal patterns are assumed, i.e. a two-class problem is considered. The direct method classifies a signal in one step. The multistage method uses a multiresolution representation of the signal in wavelet bases, starting from a coarse resolution at the first stage and moving to more detailed resolutions at the following stages. After a signal is assigned to a class, the posterior probability of this class is computed and compared with a fixed level. If the probability is higher than this level, the algorithm stops. Otherwise the signal is rejected, and at the next stage the classification procedure is repeated for a higher resolution of the signal and the posterior probability is calculated again. The algorithm stops when the probability exceeds the fixed level and the signal is finally assigned to a class. The wavelet filtration of the signal is used for feature selection and acts as a magnifier: if the posterior probability of recognition is low at some stage, the number of features at the next stage is increased by taking a finer resolution. The experiments are performed for three local decision rules: naive Bayes, linear discriminant analysis and quadratic discriminant analysis.

1 INTRODUCTION

Sometimes the direct approach to classification does not give the desired results. In that case a classifier with a reject option (Devroye et al., 1996) may be used. Rejecting an object means withholding its assignment to any of the classes when the decision is not certain at a reasonable level. This approach can reduce the risk of misclassification (Pudil et al., 1992).

In contrast to multistage classifiers based on decision trees (Burduk and Kurzyński, 2006; Kurzyński, 1988; Libal, 2010), this article presents a new multistage approach to classification. The proposed multistage classifier is a sequence of Bayes decision rules with the reject option. It is dedicated to signal recognition and uses a wavelet representation of signals. Only two classes of signals are assumed. If the class cannot be identified at some stage (i.e. the signal is rejected), the classifier tries to assign the signal to one of the two classes at the next stage. Note that the same two classes are considered at every stage, and that the number of steps taken by the algorithm is not set arbitrarily.

To avoid the curse of dimensionality (the empty space phenomenon), the signal is represented by wavelet approximation coefficients in the following way: at an early stage the classifier uses a low-resolution representation of the signal, and if this is not enough (i.e. in the case of rejection), the classifier uses a representation of increased resolution at the next stage. The method of obtaining the wavelet coefficient vectors $a_m, a_{m-1}, \ldots, a_1$ (where $a_j$ denotes the approximation coefficients at decomposition level $j$) by wavelet decomposition of signals with the Mallat algorithm is described in Section 2.1.

Libal U. Multistage Naive Bayes Classifier with Reject Option for Multiresolution Signal Representation.
DOI: 10.5220/0004266002890292
In Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods (ICPRAM-2013), pages 289-292
ISBN: 978-989-8565-41-9
© 2013 SCITEPRESS (Science and Technology Publications, Lda.)

2 MULTISTAGE CLASSIFIER

The considered problem is to classify a noisy signal into one of two classes. A multistage algorithm with a reject option is presented: at every stage a local classifier either assigns the analysed signal to a class from the set {1, 2} or rejects it. If the signal is assigned to class 1 or 2, the algorithm stops. After a rejection, on the other hand, the signal stays unclassified and waits for classification at the next stage. The stages differ in the representation of the signal:

1) $a_m$ - at the first stage a coarse signal representation in wavelet bases is considered (with a small number of wavelet approximation coefficients for a low resolution),

2) $a_{m-1}$ - at the second stage, and so on,

⋮

m) $a_1$ - up to a detailed representation in a high resolution (with a large number of coefficients).

2.1 Signal Representation

A multiresolution signal representation, in the form of the sequence $a_m, a_{m-1}, \ldots, a_1$, can be obtained by a wavelet transform of the signal (Mallat, 1989). The wavelet decomposition of a signal (Daubechies, 1992) allows the signal to be approximated at various resolutions, all lower than the initial resolution of the signal before the transformation. A fast wavelet decomposition can be performed by the Mallat algorithm, shown in Figure 1.

Figure 1: Wavelet decomposition by the Mallat algorithm.

At the first decomposition level the algorithm filters a signal $s$ with low-pass and high-pass filters and then decimates the filtering results. These computations are very fast and give two coefficient vectors, the approximation coefficients $a_1$ and the detail coefficients $d_1$:

$$ s \;\mapsto\; (a_1, d_1). \tag{1} $$

At every following level the algorithm filters the approximation $a_j$ of the signal (obtained at the previous step) with the low-pass and high-pass filters, and the results are again down-sampled, i.e.

$$ a_j \;\mapsto\; (a_{j+1}, d_{j+1}). \tag{2} $$

As noted by Mallat, the multiresolution representation of a signal or an image is a very effective method of extracting information (Mallat, 1989). Figure 2 presents two patterns (first row) generating two classes of noisy signals (second row). The approximation $a_4$ of the signal at decomposition level 4 (third row) reveals a noticeable separation between the classes, especially for the 9th coefficient.

Figure 2: Two class patterns, noisy signals (noise level 10) and wavelet approximation coefficients $a_4$ at decomposition level 4.

2.2 Direct Classifier with Reject Option

The local Bayesian classifier $\Psi_\alpha$ with reject option assigns a signal to one class from the set $\{1, 2\}$ or rejects it, which is denoted by zero. The signal is represented by a vector of wavelet approximation coefficients $a_j$. The classification rule $\Psi_\alpha$ is given by the formula

$$
\Psi_\alpha(a_j) =
\begin{cases}
1 & \text{if } 1 - p_1(a_j) \le \alpha, \\
0 & \text{if } \min\{p_1(a_j),\, 1 - p_1(a_j)\} > \alpha, \\
2 & \text{if } p_1(a_j) \le \alpha,
\end{cases}
\tag{3}
$$

where the posterior probability of assigning a signal to class 1, after observing the values of the wavelet coefficient vector $a_j$, is

$$
p_1(a_j) = P(\text{class} = 1 \mid a_j) = \frac{\pi_1 f_1(a_j)}{\pi_1 f_1(a_j) + \pi_2 f_2(a_j)}. \tag{4}
$$

The posterior probability of assigning a signal to class 2 is equal to $1 - p_1(a_j)$. The prior probabilities of class occurrences are $\pi_1$ and $\pi_2$. The probability density functions in the two classes are given by $f_1(a_j)$ and $f_2(a_j)$.

The classification rule (3) can be transformed into the following form

$$
\Psi_\alpha(a_j) =
\begin{cases}
1 & \text{if } p_1(a_j) \in [1 - \alpha,\, 1], \\
0 & \text{if } p_1(a_j) \in (\alpha,\, 1 - \alpha), \\
2 & \text{if } p_1(a_j) \in [0,\, \alpha],
\end{cases}
\tag{5}
$$

which leads to the one-stage local decision rule of the multistage algorithm defined in Section 2.3.
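As an illustration of the one-stage rule, the sketch below computes the posterior probability of class 1 and then applies the rejection threshold. It rests on several assumptions that the text does not fix: the class-conditional densities are taken to be products of univariate Gaussians (a naive Bayes model), a signal is rejected only when the posterior lies strictly between the threshold and one minus the threshold, and a tie at threshold 0.5 goes to class 1. The function names and the parameter layout are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log-density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def posterior_class1(a, prior1, params1, params2):
    """Posterior probability of class 1 for feature vector `a`.

    params1 and params2 hold one (mean, variance) pair per feature; the
    Gaussian naive Bayes model is an assumption of this sketch.
    """
    log1 = math.log(prior1) + sum(
        log_gauss(x, m, v) for x, (m, v) in zip(a, params1))
    log2 = math.log(1.0 - prior1) + sum(
        log_gauss(x, m, v) for x, (m, v) in zip(a, params2))
    diff = log2 - log1
    # Logistic form of prior1*f1 / (prior1*f1 + prior2*f2), arranged so that
    # exp() is only ever called with a non-positive argument.
    if diff >= 0.0:
        e = math.exp(-diff)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(diff))


def classify_with_reject(p1, alpha):
    """Return 1 or 2 for a class decision and 0 for a rejection."""
    if 1.0 - p1 <= alpha:
        return 1
    if p1 <= alpha:
        return 2
    return 0


# Toy model with two features, unit variances and equal priors.
params1 = [(0.0, 1.0), (0.0, 1.0)]
params2 = [(1.0, 1.0), (1.0, 1.0)]
p1 = posterior_class1([0.0, 0.0], 0.5, params1, params2)
decision = classify_with_reject(p1, alpha=0.2)
```

Working with log-densities keeps the computation usable for the longer feature vectors of the later stages, where a direct product of many densities could underflow.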

2.3 Multistage Algorithm

The multistage algorithm has the following form:

Algorithm 1. Multistage Classifier

1: for $j$ from $m$ to 1

{

2: if $p_1(a_j) \in [1 - \alpha,\, 1]$ then class=1 END

3: if $p_1(a_j) \in (\alpha,\, 1 - \alpha)$ then class=0 GO TO STEP 1

4: if $p_1(a_j) \in [0,\, \alpha]$ then class=2 END

}

5: return class

The maximum number of stages is $m$. There will be fewer stages if the posterior probability $p_1$ reaches an appropriate level, which depends on the rejection threshold $\alpha$.
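The stage loop can be sketched as follows. The function takes the posterior probabilities of class 1 in stage order (coarsest representation first) and stops at the first stage that does not reject. Two points are assumptions of this sketch rather than part of Algorithm 1: the boundary convention of the threshold test, and the forced decision at the final stage, which follows the remark in Section 4 that the classifier has to choose class 1 or 2 at the last stage and is implemented here as a plain comparison with 0.5. The name `multistage_classify` is hypothetical.

```python
def multistage_classify(posteriors, alpha):
    """Run the stages in order and return (decision, stages_used).

    `posteriors` is any iterable of class-1 posterior probabilities, one
    per stage, ordered from the coarsest to the finest representation.
    Passing a generator lets later stages be computed only when needed.
    """
    p1, stage = None, 0
    for stage, p1 in enumerate(posteriors, start=1):
        if 1.0 - p1 <= alpha:
            return 1, stage          # confident class-1 decision
        if p1 <= alpha:
            return 2, stage          # confident class-2 decision
        # otherwise: rejection, continue with the next resolution
    if p1 is None:
        raise ValueError("at least one stage is required")
    # Every stage rejected: force a decision at the last stage (assumption).
    return (1 if p1 >= 0.5 else 2), stage


# Toy posteriors for three stages: two rejections, then a class-1 decision.
decision, stages_used = multistage_classify([0.55, 0.70, 0.93], alpha=0.1)
```

With the threshold set to 0.5 the first stage always decides, which matches the one-stage classifier without a reject option.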

2.4 Theoretical Risk

The difference between the risk values of the Bayes classifiers with and without reject option (Libal, 2012), denoted by $\Psi_\alpha$ and $\Psi$ respectively, is

$$
R(\Psi) - R(\Psi_\alpha) = \int_{D_\alpha} \bigl( \min\{p_1(a_j),\, 1 - p_1(a_j)\} - \alpha \bigr) \cdot \bigl( \pi_1 f_1(a_j) + \pi_2 f_2(a_j) \bigr)\, da_j \;\ge\; 0, \tag{6}
$$

where $D_\alpha$ is the rejection area, i.e. the set of vectors $a_j$ for which $\Psi_\alpha(a_j) = 0$, when the signal is represented by the vector $a_j$ and the rejection threshold is fixed to $\alpha$. The area $D_\alpha \neq \emptyset$ only if the reject option is available, i.e. $\alpha \in [0, 0.5)$. If $\alpha = 0.5$, there is no reject option and the area $D_\alpha = \emptyset$ is an empty subset of the feature space. Examples of rejection areas are presented in Figures 3 and 4.
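The sign of the risk difference can be checked numerically on a toy problem. The sketch below is not the experiment of this paper: it assumes a single feature, two unit-variance Gaussian classes with means -1 and +1, equal priors, and a midpoint-rule integration over a finite interval. It evaluates the risk of the rule without rejection, the risk of the rule with rejection (loss equal to the threshold for a rejected signal), and the integral of the excess error over the rejection area.

```python
import math


def normal_pdf(x, mean, std=1.0):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def risks(alpha, prior1=0.5, mean1=-1.0, mean2=1.0,
          lo=-10.0, hi=10.0, n=20000):
    """Midpoint-rule estimates of (risk without reject, risk with reject,
    integral of the excess error over the rejection area)."""
    h = (hi - lo) / n
    r_plain = r_reject = gap = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        w1 = prior1 * normal_pdf(x, mean1)
        w2 = (1.0 - prior1) * normal_pdf(x, mean2)
        f = w1 + w2                     # mixture density at x
        err = min(w1, w2) / f           # smaller of the two posteriors
        r_plain += err * f * h
        if err > alpha:                 # x lies in the rejection area
            r_reject += alpha * f * h
            gap += (err - alpha) * f * h
        else:
            r_reject += err * f * h
    return r_plain, r_reject, gap


r_plain, r_reject, gap = risks(alpha=0.2)
```

For this toy setting the risk without rejection approximates the Bayes error of the two Gaussians, and `r_plain - r_reject` agrees with `gap` up to rounding, so the difference is non-negative. With the threshold at 0.5 no point falls into the rejection area and both risks coincide.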

3 EXPERIMENTS

Three multistage classifiers were tested, with the following basic decision rules:

1) naive Bayes,

2) linear discriminant analysis (LDA),

3) quadratic discriminant analysis (QDA).

Every one-stage rule was applied under the assumption that all features (i.e. the wavelet approximation coefficients at a fixed decomposition level) are pairwise uncorrelated. The algorithm has 4 stages, and the signals of length 128 were decomposed to 4 levels using the Daubechies wavelet of order 2 (Daubechies, 1992). The signal was represented (if necessary):

- at stage 1 by $a_4$ (10 coefficients),

- at stage 2 by $a_3$ (18 coefficients),

- at stage 3 by $a_2$ (34 coefficients),

- at stage 4 by $a_1$ (65 coefficients).
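The coefficient counts listed above can be reproduced with simple arithmetic. The sketch assumes the usual convolution-based decomposition, in which a signal of length n filtered with a filter of length L and down-sampled by two yields floor((n + L - 1) / 2) coefficients, and a filter length of 4 for the Daubechies wavelet of order 2. The function name is hypothetical.

```python
def approximation_lengths(n, levels, filter_length=4):
    """Number of approximation coefficients at decomposition levels 1..levels."""
    lengths = []
    for _ in range(levels):
        n = (n + filter_length - 1) // 2   # filter, then keep every 2nd sample
        lengths.append(n)
    return lengths


lengths = approximation_lengths(128, levels=4)   # [65, 34, 18, 10]
stage_sizes = lengths[::-1]                      # feature counts for stages 1..4
```

Reversing the list gives the order in which the stages use the representations, from 10 features at the first stage to 65 at the last.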

Figures 3 and 4 show, in 2D, the partition of a feature space into the decision areas of classes 1 and 2 and the rejection area. The dimensionality of the feature space in the performed experiments was always higher: depending on the stage, it was 10, 18, 34 or 65.

Figure 3: Decision areas for classes and rejection area

(between lines) for LDA.

Figure 4: Decision areas for classes and rejection area

(between curves) for QDA.

MultistageNaiveBayesClassifierwithRejectOptionforMultiresolutionSignalRepresentation

291

4 CONCLUSIONS

According to inequality (6), the theoretical risk of the one-stage Bayes classifier $\Psi$ without reject option is higher than or equal to the risk of the classifier $\Psi_\alpha$ with reject option, i.e.

$$
R(\Psi) \ge R(\Psi_\alpha). \tag{7}
$$

This fact has been confirmed by the experimental results. Figures 5, 6 and 7 show the risk of incorrect classification of signals that contain the class patterns and Gaussian white noise at the level indicated in Figure 2. For the multistage classifier (i.e. for $\alpha = 0.1, 0.2, 0.3$ and $0.4$) the risk values from all four stages were summed. At the last stage the classifier has to choose a class from the set $\{1, 2\}$, so no signals were left unclassified at the end. For $\alpha = 0.5$ the classifier has no reject option and only one stage. The rejection threshold $\alpha$ is the loss incurred by rejecting a signal (choosing class 0). The loss for choosing a wrong class from the set $\{1, 2\}$ is 1, and the loss for choosing the correct class is 0.
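The loss function described above translates directly into an empirical risk for one set of decisions. The sketch below averages the loss over a labelled sample: the threshold value for each rejection, 1 for each wrong class and 0 for each correct class. It covers a single stage only; how the stage-wise risks were combined in the reported experiments is not reproduced here, and the function name is hypothetical.

```python
def empirical_risk(decisions, labels, alpha):
    """Average loss: alpha per rejection (decision 0), 1 per wrong class,
    0 per correct class."""
    if len(decisions) != len(labels) or not labels:
        raise ValueError("decisions and labels must be non-empty and aligned")
    total = 0.0
    for decision, label in zip(decisions, labels):
        if decision == 0:
            total += alpha
        elif decision != label:
            total += 1.0
    return total / len(labels)


# Toy sample: one correct, one wrong, one rejected, one correct decision.
risk = empirical_risk([1, 2, 0, 2], [1, 1, 2, 2], alpha=0.2)
```

Because a rejection costs less than an error whenever the threshold is below 0.5, rejecting an uncertain signal lowers the average loss compared with guessing wrongly.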

The lower the threshold $\alpha$, the lower the risk of misclassification. Applying the presented classifier (given by Algorithm 1) to the wavelet representation of signals improves the classifier efficiency (see Figures 5, 6 and 7). Among the three methods, the lowest values of experimental risk were obtained for linear discriminant analysis (LDA).

REFERENCES

Burduk, R. and Kurzyński, M. (2006). Two-stage binary classifier with fuzzy-valued loss function. Pattern Analysis & Applications, 9(4), pp. 353-358.

Daubechies, I. (1992). Ten Lectures on Wavelets. Lecture Notes Vol. 61, SIAM, Philadelphia.

Devroye, L., Györfi, L. and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Springer-Verlag, New York.

Kurzyński, M. (1988). On the multistage Bayes classifier. Pattern Recognition, 21(4), pp. 355-365.

Libal, U. (2010). Multistage classification of signals with the use of multiscale wavelet representation. In MMAR'10, 15th IEEE Int. Conference on Methods and Models in Automation and Robotics, pp. 154-159.

Libal, U. (2012). Multistage pattern recognition of signals represented in wavelet bases with reject option. In MMAR'12, 17th IEEE Int. Conference on Methods and Models in Automation and Robotics, pp. 79-84.

Mallat, S.G. (1989). A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(7), pp. 674-693.

Pudil, P., Novovicova, J., Blaha, S. and Kittler, J. (1992). Multistage pattern recognition with reject option. In Pattern Recognition, Vol. II. Conference B: 11th IAPR Int. Conference on Pattern Recognition Methodology and Systems, pp. 92-95.

APPENDIX

Figure 5: Risk for naive Bayes classifier.

Figure 6: Risk for linear discriminant analysis (LDA).

Figure 7: Risk for quadratic discriminant analysis (QDA).

ICPRAM2013-InternationalConferenceonPatternRecognitionApplicationsandMethods

292