Classification of HEp-2 Staining Patterns in ImmunoFluorescence Images
Comparison of Support Vector Machines and Subclass Discriminant Analysis Strategies
Ihtesham Ul Islam, Santa Di Cataldo, Andrea Bottino, Elisa Ficarra and Enrico Macii
Dipartimento di Automatica e Informatica, Politecnico di Torino, Corso Duca degli Abruzzi 24, Torino, Italy
Keywords:
HEp-2 cells, Indirect ImmunoFluorescence, Staining Pattern Classification, Support Vector Machines,
Subclass Discriminant Analysis, Image Processing.
Abstract:
The anti-nuclear antibody test is based on the visual evaluation of the fluorescence intensity and staining pattern in HEp-2 cell slides by means of indirect immunofluorescence (IIF) imaging, revealing the presence of the autoantibodies responsible for important immune pathologies. In particular, the categorization of the staining pattern is crucial for differential diagnosis, because it provides information about the autoantibody type. The manual classification of these patterns is very time-consuming and not very reliable, since it depends on the subjectivity and experience of the specialist. This motivates the growing demand for computer-aided solutions able to perform staining pattern classification in a fully automated way. In this work we compare two classification techniques, based respectively on Support Vector Machines and Subclass Discriminant Analysis. A set of textural features characterizing the available samples is first extracted. Then, a feature selection scheme is applied in order to produce different datasets, containing a limited number of image attributes that are best suited to the classification purpose. Experiments on IIF images showed that our computer-aided method is able to identify staining patterns with an average accuracy of about 91% and demonstrated, in this specific problem, a better performance of Subclass Discriminant Analysis with respect to Support Vector Machines.
1 INTRODUCTION
Indirect immunofluorescence (IIF) is an imaging modality detecting the abundance of molecules that induce an immune response in the sample tissue.
This technique uses the specificity of antibodies to
their antigen in order to bind fluorescent dyes to spe-
cific biomolecule targets within a cell. The screen-
ing for anti-nuclear antibodies by IIF is a standard
method in the current diagnostic approach to a num-
ber of important autoimmune pathologies such as sys-
temic rheumatic diseases as well as Multiple Scle-
rosis and Diabetes (Egerer, 2010). This screening,
which makes use of a fluorescence microscope, is typ-
ically done by visual inspection on cultured cells of
the HEp-2 cell line: the specialist observes the IIF
slide at the microscope (see Fig. 1 for an example),
and makes a diagnosis based on the perceived inten-
sity of the fluorescence signal and on the type of the
staining pattern. Fluorescence intensity evaluation is needed to classify samples as positive, intermediate or negative. Then, specific staining patterns on positive and intermediate samples reveal the presence of different antibodies and, thus, different types of autoimmune diseases. Therefore, a correct description of the staining pattern is fundamental for the differential diagnosis of these pathologies. Examples of the six main staining patterns described in the literature (homogeneous, fine speckled, coarse speckled, nucleolar, cytoplasmic and centromere) are reported in Fig. 2.
They are distinguished as follows:
Homogeneous: diffuse staining of the entire nu-
cleus, with or without apparent masking of the nu-
cleoli.
Nucleolar: fluorescent staining of the nucleoli
within the nucleus, sharply separated from the un-
stained nucleoplasm.
Coarse/Fine Speckled: fluorescent aggregates
throughout the nucleus which can be very fine
to very coarse depending on the type of antibody
present.
Centromere: discrete uniform speckles through-
out the nucleus, the number corresponds to a mul-
tiple of the normal chromosome number.
Cytoplasmic Fluorescence: granular or fibrous
fluorescence in the cytoplasm.
The manual classification of HEp-2 staining patterns suffers from the usual problems of medical imaging, that is, (i) the reliability of the results depends on the specialist's experience and expertise, and (ii) the analysis of large volumes of images is a tedious and time-consuming operation, translating into higher costs for the health system. Studies report very high inter- and intra-laboratory variability for this type of screening (up to 10%), which can be even higher in non-specialized structures (Egerer, 2010). This variability impacts the reliability of the obtained results and, most of all, their reproducibility.
Thus, in the last few years reliable systems automating the whole IIF process have been in great demand and several tools have been proposed (Creemers, 2011; Hsieh, 2009; Perner, 2002; Sack, 2003; Soda, 2007). Nevertheless, the accurate classification of the staining patterns still remains a challenge.
Several classification schemes have been applied: among others, learning vector quantization (Hsieh, 2009), decision tree induction algorithms (Perner, 2002; Sack, 2003), and multi-expert systems (Soda, 2007). Unfortunately, a direct comparison of the results presented by different works is not possible, since they are obtained on different datasets and different classes. However, it is worth noting that textural features are generally acknowledged as the most appropriate for staining pattern classification.
In this work, we compare two techniques that classify the cells into one of the six staining patterns addressed by the literature. The first is based on Support Vector Machines (SVM). This approach was already introduced in our previous work (Di Cataldo, 2012), and it is described again here for the sake of completeness. The second technique is a novel procedure based on Subclass Discriminant Analysis (SDA), a recent dimensionality reduction method that has proven successful in different problems. SDA aims at classifying a large number of different data distributions, whether or not they are composed of compact sets, by describing the underlying distribution of each class using a mixture of Gaussians. Since some of the staining patterns are characterized by a non-negligible within-class variance, SDA is a promising method for their classification.
In our approach, an initial set of features, based on statistical measurements of the grey-level distributions and on frequency-domain transformations, is used to characterize each cell. The dimension of this feature vector is then reduced by applying different procedures, aimed at selecting the feature variables best suited to classification with both SVM and SDA.
After a description of the dataset employed for training and testing our methods (Section 2) and of the two classification techniques (Section 3), Section 4 presents a comparison between the two methods in order to select the best IIF classification technique. Discussion of the results and conclusions are presented in Sections 4.3 and 5, respectively.
2 MATERIALS
Our dataset contains IIF images that are publicly available at (MIVIALab, 2012). It consists of 14 annotated IIF images acquired using slides of HEp-2 substrate at the fixed dilution of 1:80, as recommended by the guidelines in (Tozzoli, 2002). The images were acquired with a resolution of 1388×1038 pixels and a color depth of 24 bits. The acquisition unit consisted of a fluorescence microscope (40-fold magnification) coupled with a 50W mercury vapour lamp and with a digital camera (SLIM system by Das srl) having a CCD with square pixels of 6.45 µm side.
An example of the available images can be seen in
Fig. 1.
From these images, a set of samples of HEp-2 cells was extracted. Specialists manually segmented each cell at a workstation monitor, labelling it with the corresponding fluorescence intensity level (either intermediate or positive) and staining pattern, the latter belonging to one of the six classes described in the introduction.
Figure 1: HEp-2 IIF image.
The dataset contains a total of 721 cells, 325 of
which with intermediate and 396 with positive fluo-
rescence intensity (see Table 1 for a full characteri-
zation).
BIOINFORMATICS2013-InternationalConferenceonBioinformaticsModels,MethodsandAlgorithms
54
Figure 2: Examples of staining patterns that are considered relevant to diagnostic purposes, with either intermediate or positive fluorescence intensity.
Table 1: HEp-2 cell dataset characterization.

Pattern            n. of samples   interm.   pos.
Homogeneous        150             47        103
Nucleolar          102             46        56
Coarse speckled    109             41        68
Fine speckled      94              48        46
Centromere         208             119       89
Cytoplasmic        58              24        34
tot.               721             325       396
The staining pattern information provided by the
specialists was used as ground truth to assess the re-
sults of the proposed classifier.
3 METHODS
Our approach combines texture analysis and feature
selection techniques in order to obtain a limited set
of image features that is optimal for the classifica-
tion task. As already mentioned, for classification
we implemented and compared two different meth-
ods, based on SVM and on SDA.
In the following subsections we provide details
about all the steps of the proposed techniques.
3.1 Size and Contrast Normalization
In order to obtain image features independent of variations in cell size and staining intensity, we applied size and contrast normalization to all the samples in our dataset. This also avoided the necessity of training two different classifiers, one for intermediate and one for positive samples. Size normalization was obtained by re-sampling all the cell images to a 64×64 pixel size. Contrast normalization consisted in linearly remapping the intensity values so that 1% of the data is saturated at low and high intensities.
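As an illustration, the following is a minimal sketch of this normalization step under the stated settings (64×64 re-sampling, 1% saturation); it assumes the segmented cell is available as a 2-D numpy array, and the function name and the even split of the saturation between the two tails are our assumptions, not details of the original implementation.

import numpy as np
from skimage.transform import resize

def normalize_cell(cell, size=64, sat=0.01):
    # Size normalization: re-sample the segmented cell to 64x64 pixels.
    img = resize(cell.astype(float), (size, size))
    # Contrast normalization: linearly remap intensities so that roughly
    # 1% of the data saturates at the low and high ends of the range
    # (split evenly between the two tails, which is an assumption).
    lo, hi = np.percentile(img, [100 * sat / 2, 100 * (1 - sat / 2)])
    return np.clip((img - lo) / (hi - lo + 1e-12), 0.0, 1.0)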
3.2 Feature Extraction
Textural analysis techniques have already proven successful in HEp-2 cell staining characterization (Di Cataldo, 2012). In fact, they are able to describe the most relevant image variations occurring in the cell, allowing to differentiate between the staining patterns. The two major approaches to textural analysis are based either on statistical methods describing the distribution of grey levels in the image or on frequency-domain measurements of image variations. In our work we propose a combination of both, in order to extract a comprehensive set of features able to fully characterize the staining pattern of the cell.
A first set of features was computed based on Gray-Level Co-occurrence Matrices (GLCM), a well-established technique that extracts textural information from the spatial relationship between intensity values at specified offsets in the image. More specifically, textural features are computed from a set of grey-tone spatial dependence matrices reporting the distribution of co-occurring values between neighbouring pixels according to different angles and distances (Haralick, 1973). In practice, the GLCM element $GLCM_{d,\theta}(i, j)$ contains the probability for a pair of pixels located at a neighbourhood distance $d$ and direction $\theta$ to have gray levels $i$ and $j$, respectively.
ClassificationofHEp-2StainingPatternsinImmunoFluorescenceImages-ComparisonofSupportVectorMachinesand
SubclassDiscriminantAnalysisStrategies
55
In our work, we extracted 44 GLCM textural features, based on 16×16 GLCMs computed for a fixed unitarian neighbourhood distance and varying angles $\theta = 0°, 45°, 90°, 135°$ (see (Di Cataldo, 2012) for details). The features are based on well-established statistical measurements whose characterization can be found in (Haralick, 1973; Soh, 1999; Clausi, 2002). The use of 4 different directions is aimed at making the method less sensitive to rotations in the images.
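A minimal sketch of this extraction with scikit-image is given below; it computes 16×16 GLCMs at unit distance and the four angles, but derives only a handful of classical statistics for illustration, whereas the full 44-feature set follows (Haralick, 1973; Soh, 1999; Clausi, 2002) as detailed in (Di Cataldo, 2012).

import numpy as np
from skimage.feature import graycomatrix, graycoprops  # greycomatrix/greycoprops in older releases

def glcm_features(img01):
    # Quantize the normalized [0, 1] image to 16 grey levels (16x16 GLCMs).
    q = np.clip((img01 * 16).astype(np.uint8), 0, 15)
    # Unit neighbourhood distance, four directions: 0, 45, 90, 135 degrees.
    glcm = graycomatrix(q, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=16, symmetric=True, normed=True)
    # A few Haralick-style statistics per direction (illustrative subset).
    props = ['contrast', 'correlation', 'energy', 'homogeneity']
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])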
Besides statistical methods, a largely used approach to extracting relevant textural information for image compression and classification is based on frequency-domain transformations (Sorwar, 2001). The underlying concept is the transformation to a different space whose coordinate system has an interpretation closely related to the description of image texture. In our work, we computed the two-dimensional Discrete Cosine Transform (DCT) (Ahmed, 1974) of the normalized images and then extracted 328 DCT coefficients, described in detail in (Di Cataldo, 2012), representing different patterns of image variation and directional information of the texture. The same approach was already successfully applied to texture classification and pattern recognition (Sorwar, 2001).
Combining GLCM and DCT sets, we obtained a
total number of 44 + 328 = 372 features to character-
ize each sample image.
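The sketch below illustrates the DCT step with SciPy; for simplicity it keeps a square low-frequency block of coefficients, while the actual 328-coefficient selection follows (Di Cataldo, 2012) and is not reproduced here.

import numpy as np
from scipy.fft import dctn

def dct_features(img01, k=16):
    # 2-D DCT of the 64x64 normalized cell image.
    coeffs = dctn(img01, norm='ortho')
    # Low-frequency coefficients carry most of the coarse texture energy;
    # a k x k block is kept here purely for illustration.
    return coeffs[:k, :k].ravel()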
3.3 Classification based on Support Vector Machines
The first classification method we implemented was already introduced in our previous work (Di Cataldo, 2012). It is based on Support Vector Machines (SVM), a well-established machine learning technique that has proven successful for classification and regression purposes in many applications (Chang, 2011). The classification is based on the implicit mapping of data to a high-dimensional space via a kernel function, and on the identification of the maximum-margin hyperplane that separates the given training instances in this space. In our work we used SVM with a radial basis kernel, optimizing the kernel parameters by means of a ten-fold cross-validation technique and a grid search, as suggested in (Chang, 2011).
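In scikit-learn terms, this training procedure can be sketched as follows; the exponential parameter grids are indicative ranges in the spirit of (Chang, 2011), not the values of our experiments.

from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import GridSearchCV

# RBF-kernel SVM with (C, gamma) tuned by grid search under
# ten-fold cross-validation.
pipe = make_pipeline(StandardScaler(), SVC(kernel='rbf'))
grid = {'svc__C': [2.0 ** e for e in range(-5, 16, 2)],
        'svc__gamma': [2.0 ** e for e in range(-15, 4, 2)]}
search = GridSearchCV(pipe, grid, cv=10)
# search.fit(X, y)   # X: feature matrix, y: staining-pattern labels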
3.3.1 Feature Selection
Feature selection (FS) strategies were applied in order to select a limited set of optimal features able to improve the accuracy of the staining pattern classifier. SVMs are widely acknowledged for their built-in feature selection capability, as they implicitly map data into a transformed domain where the features that are crucial to the classification purpose are emphasized (Temko, 2006). Nevertheless, the combination of SVM with feature selection strategies, besides improving training efficiency, can further enhance the classification accuracy. In fact, although the presence of irrelevant features does not change the hyperplane margin of SVM, it may increase the radius of the training data points, impacting SVM's generalization capability and also increasing the probability of over-fitting (Weston, 2000). In our work we applied feature selection in two sequential steps. The first is based on the minimum-Redundancy-Maximum-Relevance (mRMR) algorithm, whose better performance over conventional top-ranking methods has been widely demonstrated in the literature (Peng, 2005). The mRMR algorithm ranks the features that are most relevant for characterizing the classification variable, simultaneously minimizing their mutual similarity and maximizing their correlation with the classification variable. The number of candidate features selected by mRMR was arbitrarily set to 50.
Since mRMR works at its best when the classification variables are categorical rather than continuous, we preventively performed feature discretization on the input data. For this purpose, we applied the CAIM (class-attribute interdependence maximization) algorithm (Kurgan, 2004), which is best suited to work with supervised data, as it generates a minimal number of discrete intervals by maximizing the class-attribute interdependence.
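The following is a minimal sketch of the greedy mRMR search (the mutual-information-difference criterion of (Peng, 2005)) on features that have already been discretized; the CAIM step itself is not reproduced here, and the input Xd is assumed to be the discretized feature matrix.

import numpy as np
from sklearn.metrics import mutual_info_score

def mrmr(Xd, y, n_select=50):
    # Greedy mRMR: at each step add the feature maximizing its mutual
    # information with the class labels (relevance) minus its average
    # mutual information with the already selected features (redundancy).
    n_feat = Xd.shape[1]
    relevance = np.array([mutual_info_score(Xd[:, f], y) for f in range(n_feat)])
    selected = [int(np.argmax(relevance))]
    while len(selected) < n_select:
        rest = [f for f in range(n_feat) if f not in selected]
        scores = [relevance[f] - np.mean([mutual_info_score(Xd[:, f], Xd[:, s])
                                          for s in selected]) for f in rest]
        selected.append(rest[int(np.argmax(scores))])
    return selected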
The output of mRMR is a generic candidate feature set, which is independent of the classification algorithm (Peng, 2005) and not necessarily optimal for SVM. Therefore, as a second FS step we applied a Sequential Forward Selection (SFS) scheme in order to iteratively construct the subset of optimal features best suited for SVM classification. Classical SFS works towards the minimization of the misclassification error: starting from an initially empty set, at each iteration the feature providing the greatest classification accuracy improvement is added, until no more improvement is obtained (Ververidis, 2008). As this implementation tends to be trapped in local minima, in our approach we proceeded with the iterations until all the available features were added, and then selected the feature set with the best classification accuracy, as sketched in the code after Fig. 3. The final dimension of this optimal set was found to be 12 (see
BIOINFORMATICS2013-InternationalConferenceonBioinformaticsModels,MethodsandAlgorithms
56
Fig. 3).
Figure 3: Sequential Feature Selection strategy: misclassi-
fication error vs. number of selected features at each itera-
tion. The optimal feature set size is 12.
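A sketch of this exhaustive SFS variant is given below, assuming a scikit-learn style classifier; it runs the forward selection to completion and only then keeps the best-scoring subset, instead of stopping at the first accuracy plateau.

import numpy as np
from sklearn.model_selection import cross_val_score

def sfs_exhaustive(X, y, clf, candidates, cv=10):
    # Forward selection run until every candidate feature has been added;
    # the subset with the highest CV accuracy is returned afterwards,
    # so the search cannot stop at a local minimum of the error.
    selected, remaining, history = [], list(candidates), []
    while remaining:
        scores = [(cross_val_score(clf, X[:, selected + [f]], y, cv=cv).mean(), f)
                  for f in remaining]
        best_score, best_f = max(scores)
        selected.append(best_f)
        remaining.remove(best_f)
        history.append((best_score, list(selected)))
    return max(history)[1]   # feature subset with the best accuracy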
3.4 Classification based on Subclass Discriminant Analysis
Discriminant analysis (DA) algorithms have been used for dimensionality reduction and feature extraction in many applications of computer vision (Fukunaga, 1990; Swets, 1996; Etemad, 1997; Belhumeur, 1997). These algorithms project a set of samples $X = (x_1, x_2, \ldots, x_n)$, lying in a high-dimensional feature space $\mathbb{R}^D$ and each with an associated class label in $[1, C]$, onto a low-dimensional subspace $\mathbb{R}^d$, with $d \ll D$, where the data can be more easily separated according to their class labels. Therefore, the DA problem can be generally stated as finding the transformation matrix $V = (v_1, v_2, \ldots, v_d)$, with $v_i \in \mathbb{R}^D$, that maps a sample $x$ into the final $d$-dimensional subspace.
In most DA algorithms, the transformation matrix $V$ is found by maximizing the so-called Fisher-Rao criterion:

$$J(V) = \frac{|V^T A V|}{|V^T B V|} \qquad (1)$$

where $A$ and $B$ are symmetric positive-definite matrices, so that they define a metric. The solution to this problem is given by the generalized eigenvalue decomposition:

$$A V = B V \Lambda \qquad (2)$$

where $V$ is (as above) the desired transformation matrix, and $\Lambda$ is the diagonal matrix of the corresponding eigenvalues.
Linear Discriminant Analysis (LDA) is probably the most well-known DA technique. This method assumes that the $C$ classes the data belong to are homoscedastic, that is, their underlying distributions are Gaussian with common variance and different means. In (1), LDA uses $A = S_B$, the between-class scatter matrix, and $B = S_W$, the within-class scatter matrix, defined as:

$$S_B = \sum_{i=1}^{C} (\mu_i - \mu)(\mu_i - \mu)^T \qquad (3)$$

$$S_W = \frac{1}{n} \sum_{i=1}^{C} \sum_{j=1}^{n_i} (x_{ij} - \mu_i)(x_{ij} - \mu_i)^T \qquad (4)$$

where $C$ is the number of classes, $\mu_i$ is the sample mean of class $i$, $\mu$ is the global mean, $x_{ij}$ is the $j$-th sample of class $i$ and $n_i$ is the number of samples in class $i$.
LDA provides the $(C-1)$-dimensional subspace that maximizes the between-class variance and minimizes the within-class variance in any particular data set. In other words, it guarantees maximal class separability and, possibly, optimizes the accuracy of the subsequent classification.
However, the assumption of having $C$ homoscedastic classes is the very limitation of this method: LDA works well for linear problems, but fails to provide optimal subspaces for inherently non-linear structures in the data. Several extensions of LDA have been introduced in the literature to effectively classify data with non-linearities (Boulgouris, 2009).
To this end, one of the most effective approaches is Subclass Discriminant Analysis (SDA), proposed in (Zhu, 2006). The main idea of SDA is to find a way to describe a large number of different data distributions, whether or not they are composed of compact sets, by describing the underlying distribution of each class using a mixture of Gaussians. This is achieved by dividing the classes into subclasses. Therefore, the problem to be solved is to find the optimal number of subclasses maximizing the classification accuracy in the reduced space. In SDA, the transformation matrix $V$ is found by defining the between-subclass scatter matrix $S_B$ in equation (1) as:

$$S_B = \sum_{i=1}^{C-1} \sum_{j=1}^{H_i} \sum_{k=i+1}^{C} \sum_{l=1}^{H_k} p_{ij}\,p_{kl}\,(\mu_{ij} - \mu_{kl})(\mu_{ij} - \mu_{kl})^T \qquad (5)$$

where $H_i$ is the number of subclasses of class $i$, and $\mu_{ij}$ and $p_{ij}$ are the mean and the prior probability of the $j$-th subclass of class $i$, respectively. The priors are estimated as $p_{ij} = n_{ij}/n$, where $n_{ij}$ is the number of samples in the $j$-th subclass of class $i$. In the simplest case of SDA with no class subdivisions, this equation reduces to that of LDA.
In order to select the optimal number of subclasses, the authors of (Zhu, 2006) propose two different methods. The first is based on a stability criterion described in (Martinez, 2005). However, as
ClassificationofHEp-2StainingPatternsinImmunoFluorescenceImages-ComparisonofSupportVectorMachinesand
SubclassDiscriminantAnalysisStrategies
57
pointed out in (Gkalelis, 2011), when the data have a Gaussian homoscedastic subclass structure, the minimization of the metric used in this criterion is not guaranteed; the authors of (Gkalelis, 2011) hypothesize that this is likely to happen also for heteroscedastic classes.
The second selection criterion is based on a leave-one-object test: for each subdivision, a leave-one-out cross validation (LOOCV) is applied, and the optimal subdivision is the one giving the maximal recognition rate. The problem with this strategy is its very high computational cost, especially when the dataset to classify is large and the number of classes is high. This is exactly our case, where the initial classes are 6 and the samples are 721.
Therefore, to overcome these problems, we used a different formulation of the optimality criterion, similar to the leave-one-object test but based on a stratified 5-fold cross validation, which optimizes the accuracy obtained with a k-Nearest Neighbour (kNN) classifier. A value of 8 for k was heuristically found to provide good classification results.
Our implementation differs from the original SDA formulation in two other details. The first concerns the clustering method used to divide classes into subclasses. In (Zhu, 2006), data are assigned to subclasses by first sorting the class samples with a Nearest Neighbour-based algorithm and then dividing the obtained list into a set of clusters of the same size. However, this method does not allow to model efficiently the non-linearity present in the data, as in the case of the staining patterns under analysis. Therefore, we used the K-means algorithm, which partitions the samples into k clusters by iteratively minimizing the sum, over all clusters, of the within-cluster sums of sample-to-cluster-centroid distances. Since, in this method, the centroids are initially set at random, different initializations result in different partitions. Hence, we repeated the clustering 20 times and kept the solution providing the minimal sum of all within-cluster distances.
The second difference is that, instead of increasing the number of subclasses of each class by the same amount at each iteration, all the possible permutations of class subdivisions are created by iteratively incrementing by one the number of subclasses of a single class in a set of nested loops. For a specific class $r$, the subdivision process is stopped when the minimal number of samples in the $H_r$ clusters obtained with K-means drops below a predefined threshold. In order to reduce the computational time, the clusters created in the inner loops are computed only once and cached for further use.
The classification accuracy of our method is computed as the average accuracy over the different CV rounds. It should be underlined that, given the differences between training and test sets, different optimal subclass subdivisions are likely to be obtained at each CV iteration.
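As an illustration of the resulting selection criterion, the sketch below scores one candidate subdivision (so many subclasses per class) using K-means clustering, the between-subclass scatter of equation (5), and a stratified 5-fold CV of an 8-NN classifier in the reduced space. For brevity, the projection is fitted on the whole dataset and the metric matrix B of equation (1) is taken as the regularized total covariance; both are simplifying assumptions of ours, and in the actual procedure the optimal subdivision is recomputed within each CV round.

import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score, StratifiedKFold

def score_subdivision(X, y, n_sub):
    # n_sub maps each class label to its number of subclasses H_i.
    n, D = X.shape
    means, priors = [], []
    for c, h in n_sub.items():
        Xc = X[y == c]
        # 20 random restarts; KMeans keeps the best within-cluster sum.
        km = KMeans(n_clusters=h, n_init=20).fit(Xc)
        for j in range(h):
            members = Xc[km.labels_ == j]
            means.append(members.mean(axis=0))
            priors.append(len(members) / n)
    # Between-subclass scatter matrix, equation (5).
    Sb = np.zeros((D, D))
    for a in range(len(means)):
        for b in range(a + 1, len(means)):
            d = (means[a] - means[b])[:, None]
            Sb += priors[a] * priors[b] * (d @ d.T)
    # Generalized eigenproblem (2); keep the H-1 leading eigenvectors.
    B = np.cov(X.T) + 1e-6 * np.eye(D)
    w, V = eigh(Sb, B)
    V = V[:, np.argsort(w)[::-1][:sum(n_sub.values()) - 1]]
    # Stratified 5-fold CV accuracy of an 8-NN classifier in the subspace.
    knn = KNeighborsClassifier(n_neighbors=8)
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    return cross_val_score(knn, X @ V, y, cv=cv).mean()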
3.4.1 Feature Selection
As for the SVM classifier, we applied FS strategies to SDA as well. In this case, we used only the reduced feature set obtained with mRMR, for two reasons. First, while mRMR is independent of the classification method, SFS relies on the classifier output, which makes it unfeasible given the computational cost of SDA.
Second, it can easily be shown that the rank of the matrix $S_B$, and therefore the dimensionality $d$ of the reduced subspace obtained from equation (2), is given by $\min(H - 1, \operatorname{rank}(S_X))$, where $H$ is the total number of subclasses and $\operatorname{rank}(S_X)$ is equal to (or lower than) the number of features characterizing each sample. While the number of features selected with mRMR (50) is a reasonable upper bound for $d$, reducing it further might hamper the possibility of obtaining a good classification accuracy in problems, like the one tackled in this paper, in which the data present high non-linearities.
4 RESULTS
The two classification methods presented in Section 3
were tested on the same annotated IIF images, using
the staining pattern information provided by the spe-
cialists as ground truth for cross-validation.
4.1 SVM Classification
We recall here the experimental results on SVM classification, already reported in our previous paper (Di Cataldo, 2012), for comparison with the SDA approach. Experiments were run on the following datasets:
dataset I, the initial 372-element feature set;
dataset II, the 50-element candidate set selected by mRMR feature selection;
dataset III, the final 12-element feature vector obtained with the combination of mRMR and SFS.
10-fold cross-validation accuracy results are re-
ported in Table 2, grouped by staining pattern. The
BIOINFORMATICS2013-InternationalConferenceonBioinformaticsModels,MethodsandAlgorithms
58
last row of the table reports the overall accuracy ob-
tained by SVM in each dataset.
Table 2: SVM Classification results: accuracy rate (%).

Pattern            dataset I   dataset II   dataset III
Homogeneous        78.66       84.00        86.00
Nucleolar          89.22       93.14        93.14
Coarse speckled    92.66       95.41        98.17
Fine speckled      45.75       61.70        71.28
Centromere         84.13       88.46        87.02
Cytoplasmic        58.62       86.21        82.76
overall            77.95       85.58        86.96
These are the main considerations that arise from analysing the results reported in Table 2:
The SVM classifier obtained an average accuracy of 86.96% over the six staining patterns. The maximum and minimum per-class accuracies were 98.17% (coarse speckled pattern) and 71.28% (fine speckled pattern).
The improvement of SVM classification accuracy due to FS strategies was significant (+9.01% on the overall average accuracy). This confirms the considerations drawn in Section 3.3.1 about the weakness of the implicit feature selection ability attributed to SVM. In particular, mRMR improved the per-class accuracy of all the staining patterns (see results on dataset II compared to those on dataset I). The combination mRMR+SFS (dataset III) further improved the average accuracy of SVM. While the per-class accuracies of the centromere and cytoplasmic patterns slightly decreased, the fine speckled pattern, which had the lowest per-class accuracy, obtained the best improvement (+9.58% w.r.t. dataset II and +25.53% w.r.t. dataset I). This non-uniform behaviour is not surprising, since SFS optimized the average classification accuracy on the overall dataset and not the accuracies of the single classes.
4.2 SDA Classification
Table 3, again organized by staining class, summarizes the classification results obtained with SDA. As already explained in Section 3.4, the SFS strategy was not applied in combination with the SDA classifier. Therefore, the table contains only results on dataset I (the initial 372-feature set) and dataset II (the 50-feature set obtained with mRMR).
LDA results (which are those obtained with SDA with no class subdivisions) are also provided for comparison, in order to demonstrate the effective capability of SDA to better classify datasets with high non-linearities. Finally, the last row shows the overall accuracies obtained in the four cases.

Table 3: SDA Classification results: accuracy rate (%).

                   dataset I          dataset II
Pattern            LDA      SDA       LDA      SDA
Homogeneous        63.33    80.67     80.00    85.33
Nucleolar          90.29    94.29     91.19    96.14
Coarse speckled    87.19    88.05     92.55    91.69
Fine speckled      62.87    79.71     65.91    89.42
Centromere         75.96    73.53     80.78    82.17
Cytoplasmic        92.88    100.00    91.52    100.00
overall            78.75    86.04     83.66    90.79
Analysing the results, some considerations can be drawn:
As expected, the overall accuracy of SDA outperforms that of LDA (+7.29% on dataset I and +7.13% on dataset II). Concerning the per-class results, SDA performs better in most of the cases (except for the centromere class on dataset I and the coarse speckled class on dataset II), the best improvements being +17.34% on dataset I for the homogeneous class and +23.51% on dataset II for the fine speckled class.
As in the SVM experiments, FS effectively improves the SDA accuracy of all classes (the best improvement being the +9.71% of fine speckled); the overall improvement is +4.75%.
The best average accuracy obtained is 90.79%, with dataset II, which outperforms the best accuracy obtained by SVM with mRMR+SFS feature selection (86.96%, dataset III). The best per-class improvements were obtained for the fine speckled (+18.14%) and cytoplasmic (+17.24%) classes, while the coarse speckled and centromere classes obtained slightly lower accuracies (-6.48% and -4.85%, respectively).
4.3 Discussion
The results presented in Tables 2 and 3 suggest that the proposed algorithm, irrespective of the classification technique actually applied, is a good solution for the automated classification of immunofluorescence cell patterns. As a matter of fact, the accuracy rate is comparable to the one obtained by specialists, whose inter-laboratory variability is generally assessed at around 10% or even higher (Egerer, 2010). Besides that, differently from human operators, our technique provides fully-repeatable results that are based on objective and quantitative features of the images.
As for the classification techniques, the same results show that the SDA technique, in combination with
ClassificationofHEp-2StainingPatternsinImmunoFluorescenceImages-ComparisonofSupportVectorMachinesand
SubclassDiscriminantAnalysisStrategies
59
a proper selection of the most relevant features, outperforms the best accuracy achievable with SVM on the same dataset (dataset II) and even that obtained by SVM on dataset III, which was specifically optimized for that technique with a two-step FS process. Therefore, our experiments show the capability of SDA to describe more precisely the underlying distribution of each staining pattern class, improving the classification accuracy.
5 CONCLUSIONS
In this paper we compared two approaches, based on SVM and SDA, for the automatic classification of staining patterns in HEp-2 cell IIF images. Texture descriptors based on GLCM and DCT coefficients are first exploited to extract a 372-element feature vector for each cell. Then, a feature selection algorithm is applied to obtain a reduced candidate feature set that improves the classification accuracy of the two methods. Feature selection is based on the mRMR algorithm, which ranks the features that are most relevant for characterizing the classification variable; the 50 top-ranked features were selected. In the case of the SVM-based method, a two-step feature selection procedure, coupling mRMR with the SFS algorithm, is implemented in order to further improve the classification accuracy. The two approaches provide average classification accuracies of about 87% and 91%, respectively. These results are comparable with those of human specialists but, conversely, completely repeatable, since our automated technique does not depend on the subjectivity of the operator. Moreover, our experiments show the effectiveness of SDA in describing more precisely, compared to SVM, the underlying distribution of each staining pattern class.
As future steps, we plan to work on:
1) a better characterization of cell patterns, insensitive to changes in size, rotation and intensity;
2) an improvement of the SDA classifier in terms of computational efficiency; for this purpose, methods selecting a priori the classes that effectively need to be partitioned, like the one described in (Kim, 2010), will be investigated.
Moreover, we plan to develop a pipeline for the automatic segmentation of cells in IIF images and to combine it with our pattern classification algorithm, in order to obtain a complete automated approach for the computer-aided diagnosis (CAD) of autoimmune diseases.
REFERENCES
Ahmed, N. and Natarajan, T. and Rao, K. R. Discrete
Cosine Transform. IEEE Trans. Computers, 90–93,
1974.
Belhumeur, P. N. and Hespanha, J. P. and Kriegman, D. J.
Eigenfaces vs. Fisherfaces: Recognition Using Class
Specific Linear Projection. IEEE Trans. Pattern Anal-
ysis and Machine Intelligence, vol. 19, no. 7, pp. 711-
720, July 1997.
Boulgouris, N. V. and Plataniotis, K. N. and Micheli-
Tzanakou, E. Discriminant Analysis for Dimensional-
ity Reduction: An Overview of Recent Developments.
In Biometrics: Theory, Methods, and Applications,
Wiley, 2009
Chang, C.-C. and Lin, C.-J. Libsvm: A library for support
vector machines. ACM Trans. Intell. Syst. Technol.,
2(3):27:1–27:27, May 2011.
Clausi, D. A., An analysis of co-occurrence texture statis-
tics as a function of grey level quantization. Can. J.
Remote Sensing 28(1):45–62, 2002.
Creemers, C. and Guerti, K. and Geerts, S. and Van Cot-
them, K. and Ledda, A. and Spruyt, V. HEp-2 cell
pattern segmentation for the support of autoimmune
disease diagnosis. ISABEL 2011, Proc. of, 28:1–5,
2011.
Di Cataldo, S. and Bottino, A. and Ficarra, E. and Macii,
E. Applying Textural Features to the Classification
of HEp-2 Cell Patterns in IIF images. 21st Inter-
national Conference on Pattern Recognition (ICPR
2012), Tsukuba, Japan, November 11-15, 2012.
Egerer, K. and Roggenbuck, D. and Hiemann, R. and
Weyer, M. G. and Buttner, T. and Radau, B. and
Krause, R. and Lehmann, B. and Feist, E. and
Burmester, G. R. Automated evaluation of autoan-
tibodies on human epithelial-2 cells as an approach
to standardize cell-based immunofluorescence tests.
Arthritis Research & Therapy, 12(2):1–9, 2010
Etemad, K. and Chellapa, R. Discriminant Analysis for
Recognition of Human Face Images. J. Optical Soc.
Am. A, vol. 14, no. 8, pp. 1724-1733, 1997.
Fukunaga, K. Introduction to Statistical Pattern Recogni-
tion. second ed. Academic Press, 1990.
Gkalelis, N. and Mezaris, V. and Kompatsiaris, I. Mixture subclass discriminant analysis. IEEE Signal Processing Letters, vol. 18, no. 5, pp. 319-322, May 2011.
Haralick, R. M. and Shanmugam, K. and Dinstein, I.. Tex-
tural features for image classification. Systems, Man
and Cybernetics, IEEE Transactions on, 3(6):610–
621, nov. 1973.
Hsieh, R. Y. and Huang, Y. C. and Chung, C. W. and Huang,
Y. L. HEp-2 Cell Classification in Indirect Immunofluorescence Images. ICICS 2009, Proc. of, 26:211–
214, 2009.
Kurgan, L. A. and Cios, K. J. Caim discretization algorithm.
IEEE Trans. on Knowl. and Data Eng., 16(2):145–
153, Feb. 2004.
Martinez, A. M. and Zhu, M. Where Are Linear Feature
Extraction Methods Applicable? IEEE Trans. Pattern
BIOINFORMATICS2013-InternationalConferenceonBioinformaticsModels,MethodsandAlgorithms
60
Analysis and Machine Intelligence, vol. 27, no. 12, pp.
1934-1944, Dec. 2005.
MIVIA Lab, http://nerone.diiie.unisa.it/zope/mivia/databases/db database/biomedical/ last accessed: September 2012.
Peng, H. and Long, F. and Ding, C. Feature selection based
on mutual information: criteria of max-dependency,
max-relevance, and min-redundancy. IEEE Trans
PAMI, 27:1226–1238, 2005.
Perner, P. and Perner, H. and Muller, B. Mining knowledge
for HEp-2 cell image classification. Artificial Intelli-
gence in Medicine, 26:161 –173, 2002.
Sack, U. and Knoechner, S. and Warschkau, H. and Pigla, U. and Emmrich, F. and Kamprad, M. Computer-assisted classification of HEp-2 immunofluorescence patterns in autoimmune diagnostics. Autoimmunity Reviews, 2(5):298–304, 2003.
Kim, S.-W. A pre-clustering technique for optimizing sub-
class discriminant analysis, Pattern Recognition Let-
ters, Volume 31, Issue 6, pp. 462-468, 2010
Soda, P. and Iannello, G. A Hybrid Multi-Expert Sys-
tems for HEp-2 Staining Pattern Classification. ICIAP
2007, Proc. of, 685–690, 2007.
Soh, L. and Tsatsoulis, C. Texture Analysis of SAR Sea
Ice Imagery Using Gray Level Co-Occurrence Matri-
ces IEEE Transactions on Geoscience and Remote
Sensing, 37(2), 1999.
Sorwar, G. and Abraham, A. and Dooley, L. S. Texture Classification Based on DCT and Soft Computing. FUZZ-IEEE'01, Proc. of, 2-5 Dec. 2001.
Swets, D. L. and Weng, J. J. Using Discriminant Eigenfea-
tures for Image Retrieval. IEEE Trans. Pattern Anal-
ysis and Machine Intelligence, vol. 18, no. 8, pp. 831-
836, Aug. 1996.
Temko, A. and Camprubi, C. N. Classification of acoustic
events using SVM-based clustering schemes. Pattern
Recognition, 39(4): 682–694, 2006.
Tozzoli, R. and Bizzaro, N. and Tonutti, E. and Villalta, D.
and Bassetti, D. and Manoni, F. and Piazza, A. and
Pradella, M. and Rizzotti, P. Guidelines for the lab-
oratory use of autoantibody tests in the diagnosis and
monitoring of autoimmune rheumatic diseases. Am J
Clin Pathol, 117(2): 316–24, 2002.
Ververidis, D. and Kotropoulos, C. Fast and accurate feature
subset selection applied into speech emotion recogni-
tion. Els. Signal Process., 88(12): 2956–2970, 2008.
Weston, J. and Mukherjee, S. and Chapelle, O. and Pontil,
M. and Poggio, T. and Vapnik, V. Feature selection for
SVMs. Advances in Neural Information Processing
Systems 13, 668–674, 2000.
Zhu, M. and Martinez, A. M. Subclass Discriminant Anal-
ysis. IEEE Trans PAMI, 28(8): 1274–1286, 2006.
ClassificationofHEp-2StainingPatternsinImmunoFluorescenceImages-ComparisonofSupportVectorMachinesand
SubclassDiscriminantAnalysisStrategies
61