STATISTICAL ASSOCIATIVE CLASSIFICATION OF MAMMOGRAMS
The SACMiner Method

Carolina Y. V. Watanabe (1,2)
1 Department of Informatics, Federal University of Rondônia, Porto Velho, RO, Brazil

Marcela X. Ribeiro
Computer Department, Federal University of São Carlos, São Carlos, SP, Brazil

Caetano Traina Jr. (2), Agma J. M. Traina (2)
2 Department of Computing, University of São Paulo, São Carlos, SP, Brazil

Keywords: Statistical association rules, Computer-aided diagnosis, Associative classifier, Breast cancer.
Abstract: In this paper, we present a new method called SACMiner for mammogram classification using statistical association rules. The method employs two new algorithms: StARMiner* and the Voting classifier (V-classifier). StARMiner* mines association rules over continuous feature values, avoiding the bottleneck and the inconsistencies that a discretization step introduces into the learning model. The V-classifier decides which class best represents a test image, based on the statistical association rules mined. Experiments comparing SACMiner with other traditional classifiers in detecting breast cancer in mammograms show that the proposed method reaches higher values of accuracy, sensitivity and specificity. The results indicate that SACMiner is well-suited to classify mammograms. Moreover, the proposed method has a low computational cost, being linear in the number of dataset items, when compared with other classifiers. Furthermore, SACMiner is extensible to work with other types of medical images.
1 INTRODUCTION

The technological progress in acquiring medical images has increased the need for classification methods that speed up and assist radiologists in the image analysis task. Hence, there is a growing need for more accurate computer-aided methods with low computational cost. In this scenario, new approaches have been developed and employed in the computer-aided diagnosis (CAD) field. One of these approaches is association rule mining, which has become an effective way to develop classification methods that enhance the accuracy of medical image analysis. In most of these approaches, images are submitted to image processing algorithms to produce a feature vector representation. The images, represented by a set of continuous features, are then submitted to association rule mining algorithms to reveal their intra- and inter-class dependencies. These rules are then employed for classification. In general, association-rule based approaches reach higher values of accuracy when compared to other rule-based classification methods (Dua et al., 2009).
In this paper, we present a new method, called Statistical Associative Classifier Miner (SACMiner), for mammogram classification using statistical association rules. The method employs statistical association rules to build a classification model. First, the images are segmented and submitted to a feature extraction process. Each image is represented by a vector of continuous visual features, such as texture, shape and color. In the training phase, statistical association rules are mined relating continuous features and image classes. The rules are mined using a new algorithm called StARMiner*, which is based on the feature selection algorithm StARMiner, proposed in (Ribeiro et al., 2005), to produce more semantically significant patterns. StARMiner* does not require
the discretization step that other methods need. This avoids embedding the inconsistencies produced by the discretization process into the mining process and also makes the whole process faster. In the test phase, a voting classifier decides which class best represents a test image, based on the statistical association rules mined. Experiments comparing SACMiner with traditional classifiers show that the proposed method reaches high values of accuracy, sensitivity and specificity. These results indicate that SACMiner is well-suited to classify mammograms. Another advantage of SACMiner is that it builds a learning model that is easy to understand, making the user aware of why an image was assigned to a given class. Moreover, the proposed method has a low computational cost (linear in the number of dataset items) when compared to other classifiers.
This paper is structured as follows. Section 2 presents concepts and previous work related to this paper. Section 3 details the proposed method. Section 4 shows the experiments performed to evaluate the method. Finally, Section 5 gives the conclusions and future directions of this work.
2 BACKGROUND AND RELATED
WORKS
The problem of mining association rules consists of finding sets of items that frequently occur together in a dataset. It was first stated in (Agrawal et al., 1993) as follows. Let I = {i_1, ..., i_n} be a set of literals called items. A set X ⊆ I is called an itemset. Let R be a table with transactions t involving elements that are subsets of I. An association rule is an expression of the form X → Y, where X and Y are itemsets. X is called the body or antecedent of the rule, and Y is called the head or consequent of the rule.
Let |R| be the number of transactions in relation R, and let |Z| be the total number of occurrences of the itemset Z in transactions of relation R. The support and confidence measures (Equations 1 and 2) are used to determine the rules returned by the mining process.

Support = |X ∪ Y| / |R|   (1)

Confidence = |X ∪ Y| / |X|   (2)
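To make Equations 1 and 2 concrete, the sketch below computes the support and confidence of a rule X → Y over a toy transaction table; the item names are purely illustrative and not taken from the paper.

```python
# Transactions as sets of categorical items; a rule is (body X, head Y).
transactions = [
    {"mass", "spiculated", "malignant"},
    {"mass", "round", "benignant"},
    {"mass", "spiculated", "malignant"},
    {"calcification", "benignant"},
]

def support_confidence(X, Y, transactions):
    """Return (support, confidence) of the rule X -> Y over a transaction table."""
    n_xy = sum(1 for t in transactions if X | Y <= t)  # transactions containing X and Y
    n_x = sum(1 for t in transactions if X <= t)       # transactions containing X
    support = n_xy / len(transactions)                 # Equation (1)
    confidence = n_xy / n_x if n_x else 0.0            # Equation (2)
    return support, confidence

print(support_confidence({"spiculated"}, {"malignant"}, transactions))  # (0.5, 1.0)
```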
The problem of mining association rules, as it was first stated, involves finding rules in a database of categorical items that satisfy the restrictions of minimum support and minimum confidence specified by the user. This problem involves finding rules that correlate categorical (nominal) data items. However, images are represented by feature vectors of continuous values. Thus, an approach that handles quantitative values is more appropriate to work with images. In (Aumann and Lindell, 1999; Ribeiro et al., 2005; Srikant and Agrawal, 1996), procedures for mining quantitative association rules, which relate continuous-valued attributes, are presented.
Association rules have indeed been employed to mine images using discrete and categorical attributes. One of these works was presented in (Ordonez and Omiecinski, 1999), which describes a procedure for discovering association rules in the image content of a simple image dataset. The images are first segmented into blobs, grouping pixels according to their similarity. After that, a feature vector is generated to represent each blob. A similarity function is applied to compare blobs from different images and, if they are considered similar, they are represented by the same object identifier (OID). The OIDs from the objects of each image compose the image records, which are used to represent the images during the mining process. An association rule mining algorithm is applied to the image records, generating rules that relate the object identifiers. The resulting rules show the relationships between the most frequent objects.
Works applying association rules to classify mammograms have also been developed, showing promising results. In general, these methods have two main phases: association rule mining and an associative classification step. An associative classifier is a classifier that uses a set of association rules as its learning model. For example, (Wang et al., 2004) presented an association rule method to classify mammograms based on categorical items. In this method, a record combining three shape features and the image class is generated for each image. The features are discretized into ten equal-sized intervals so that an association mining algorithm can be applied. The rules are mined with the restriction that no classification item occurs in the body part. A new image is classified according to a kind of voting classifier, where the number of rules matched and the confidence of the rules are employed to decide the class of the test image. A drawback of this technique is the discretization process, which may embed inconsistencies in the data, reducing the accuracy of the classifier.
In (Antonie et al., 2003), an associative classifier
was presented to classify mammograms. In the pre-
processing phase, images are cropped and enhanced
using histogram equalization. Features of mean, vari-
ance, skewness and kurtosis are extracted from the
images and, together with some other descriptors (e.g., breast position and type of tissue), compose the image records used in the association rule mining process. The rules are mined using low confidence values, and the class label is restricted so that it occurs only in the head of the rules. The associative classifier employed is based on the voting strategy, i.e., the classifier counts the number of rules that a new image satisfies and chooses its class accordingly.
In (Ribeiro et al., 2009), a method that employs association rules over a set of discretized features of mammogram images was proposed. The method uses a discretized feature vector and keywords from the image diagnosis to compose the image record. The training image records were submitted to an association rule mining algorithm, restricting the keywords to occur only in the head of the rules. The mined rules were submitted to an associative classifier that gives a score to each keyword. If the score is greater than a given value, the keyword is returned to compose the suggested diagnosis of the image; otherwise the keyword is discarded.
In (Dua et al., 2009), a method for the classification of mammograms was presented. The method uses a weighted association-rule based classifier. First, the images are preprocessed and texture features are extracted from each region of interest. Second, the features are discretized and submitted to an association rule algorithm. The produced rules are employed for mammogram classification. In fact, most works in the literature require the discretization of continuous data before applying association rule mining.
In this work, we propose to employ statistical association rules to improve computer-aided diagnosis systems without depending on discretized features. Our method, called SACMiner, suggests a second opinion to the radiologists. Two algorithms were developed to support the method. The first one is the Statistical Association Rule Miner (StARMiner*), which mines rules selecting the features that best represent the images. The second algorithm is the Voting Classifier (V-classifier), which uses the rules mined by StARMiner* to classify images. To validate the proposed method, we performed experiments using two different breast cancer datasets, and we compared SACMiner with well-known classifiers from the literature. The results indicate that the statistical association rule approach presents high quality in the task of diagnosing medical images.
3 PROPOSED METHOD: SACMiner

The proposed method employs statistical association rules to suggest diagnoses for medical images. The method selects the features that best discriminate images into categorical classes. It avoids the discretization step, which is necessary in most association rule algorithms, reducing the complexity of the subsequent steps of the method. Also, the method promotes an easier comprehension of the learning model, making the classification process easy to understand.
The pipeline and the algorithm of the proposed method are presented in Figure 1 and Algorithm 1, respectively. The method works in two phases: training and test. In the first one, features are extracted from the images and placed in the corresponding feature vectors. This step includes the image pre-processing. After that, the feature vectors are the input to the SACMiner method. Two algorithms were developed to support the method: StARMiner* and the Voting classifier (V-classifier). StARMiner* uses the feature vectors and the classes of the training images to perform statistical association rule mining. It selects the most meaningful features and produces the statistical association rules. In the test phase, the feature vectors of the test images are extracted and submitted to the V-classifier, which uses the statistical association rules produced by StARMiner* to suggest a diagnosis class for the test image. We discuss each step of the SACMiner method in the following subsections.
Figure 1: Proposed method. (Pipeline: in the training phase, images are pre-processed, features are extracted into feature vectors, and StARMiner* produces the statistical association rules; in the classification phase, the feature vector of a new image is matched against the rules to produce the suggested "second opinion".)
3.1 The StARMiner* Algorithm

StARMiner* is a supervised algorithm whose goal is to find statistical association rules over the feature vectors extracted from the images, providing the attributes that best discriminate images into categorical classes. It returns rules relating feature intervals and image classes.
Formally, let us consider x_j an image class and f_i an image feature (attribute). Let V_min and V_max be the limit values of an interval. A rule mined by the StARMiner* algorithm has the form:

f_i[V_min, V_max] → x_j   (3)

An example of a rule mined by StARMiner* is

5[-0.07, 0.33] → benignant mass

This rule indicates that images whose 5th feature value lies in the closed interval [-0.07, 0.33] tend to be images of benignant masses. Algorithm 2 shows the main steps of StARMiner*.
To perform the association rule mining, the dataset under analysis is scanned just once. StARMiner* calculates the mean, the standard deviation and the Z value (used in the hypothesis test) for each attribute. Two restrictions of interest must be satisfied in the mining process. The first restriction is that the feature f_i must behave differently in images from class x_j than in images from the other classes. The second restriction is that the feature f_i must present a uniform behavior in every image from class x_j.
The restrictions of interest are processed in line 7. Let T be the image dataset, x_j an image class, T_{x_j} ⊆ T the subset of images of class x_j, and f_i the i-th feature of the feature vector. Let μ_{f_i}(T_{x_j}) and σ_{f_i}(T_{x_j}) be, respectively, the mean and the standard deviation of feature f_i in images from class x_j; μ_{f_i}(T − T_{x_j}) and σ_{f_i}(T − T_{x_j}) correspond to, respectively, the mean and the standard deviation of the f_i values of the images that are not from class x_j.
A rule f_i[V_min, V_max] → x_j is produced by the algorithm only if the rule satisfies the input thresholds μ_min, σ_max and γ_min:
μ_min is the minimum allowed difference between the mean of feature f_i in the images from class x_j and its mean in the remaining images of the dataset;
σ_max is the maximum standard deviation of the f_i values allowed in class x_j;
γ_min is the minimum confidence to reject the hypothesis H0. The hypothesis H0 states that the means of the f_i values inside and outside class x_j are statistically equal:

H0 : μ_{f_i}(T_{x_j}) = μ_{f_i}(T − T_{x_j})   (4)

The values of V_min and V_max are computed as:

V_min = μ_{f_i} − σ_max   (5)

V_max = μ_{f_i} + σ_max   (6)

Algorithm 1: Steps of the proposed method.
Input: Training images, a test image
Output: Report (class of the test image)
1: Extract features of the training images
2: Execute the StARMiner* algorithm to mine association rules
3: Extract features of the test image
4: Execute the V-classifier
5: Return the suggested report (class)
StARMiner* has the interesting property that the maximum number of rules mined per class x_j is the total number k of image features.
The complexity of this algorithm is Θ(ckN), where N is the number of instances in the dataset, k is the number of features and c is the number of classes.
StARMiner* is based on the idea of the feature selection algorithm StARMiner. The main difference between the StARMiner and StARMiner* algorithms is that the latter has the advantage of mining more semantically significant rules. While StARMiner only relates classes to the features that best discriminate them, StARMiner* finds rules relating a class to the feature intervals where the particular behavior occurs.
Algorithm 2: The StARMiner* algorithm.
Input: Database T: table of feature vectors {x_j, f_1, f_2, ..., f_n}, where x_j is the image class and f_i are the image features; thresholds μ_min, σ_max and γ_min.
Output: Mined rules
1: Scan database T;
2: for each class x_j do
3:   for each feature f_i do
4:     Compute μ_{f_i}(T_{x_j}) and μ_{f_i}(T − T_{x_j});
5:     Compute σ_{f_i}(T_{x_j}) and σ_{f_i}(T − T_{x_j});
6:     Compute the Z value Z_{ij};
7:     if (μ_{f_i}(T_{x_j}) − μ_{f_i}(T − T_{x_j})) ≥ μ_min AND σ_{f_i}(T_{x_j}) ≤ σ_max AND (Z_{ij} < Z_1 OR Z_{ij} > Z_2) then
8:       Write x_j → f_i[μ_{f_i} − σ_max, μ_{f_i} + σ_max];
9:     end if
10:   end for
11:   if any rule is found then
12:     Choose the feature f_i whose Z value is the biggest
13:     Write f_i[μ_{f_i} − σ_max, μ_{f_i} + σ_max] → x_j;
14:   end if
15: end for
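For illustration, the following Python sketch mirrors the rule-emitting path of Algorithm 2 (lines 2-10); it is not the authors' implementation. The exact form of the Z statistic is not given in the paper, so a standard two-sample Z value is assumed here, and the fallback of lines 11-14 is omitted for brevity.

```python
import numpy as np

def starminer_star(X, y, mu_min, sigma_max, z_crit):
    """Sketch of a StARMiner*-style mining loop.

    X: (N, k) array of continuous features; y: array of N class labels.
    Returns rules as (feature_index, v_min, v_max, class_label) tuples,
    i.e. f_i[mu - sigma_max, mu + sigma_max] -> x_j.
    """
    rules = []
    for cls in np.unique(y):
        in_cls, out_cls = X[y == cls], X[y != cls]
        for i in range(X.shape[1]):
            mu_in, mu_out = in_cls[:, i].mean(), out_cls[:, i].mean()
            sd_in, sd_out = in_cls[:, i].std(ddof=1), out_cls[:, i].std(ddof=1)
            # Assumed two-sample Z statistic for H0: mu_in == mu_out
            # (the paper only states that a Z value is used in the hypothesis test).
            se = np.sqrt(sd_in ** 2 / len(in_cls) + sd_out ** 2 / len(out_cls))
            z = (mu_in - mu_out) / se if se > 0 else 0.0
            # Conditions of Algorithm 2, line 7.
            if (mu_in - mu_out) >= mu_min and sd_in <= sigma_max and abs(z) > z_crit:
                rules.append((i, mu_in - sigma_max, mu_in + sigma_max, cls))
    return rules
```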
3.2 The Proposed Classifier

We developed a classifier that uses the rules mined by StARMiner*. The main idea is to count 'votes': for each class, we count the number of rules that are satisfied, and this count is normalized by the number of rules of the class. The output is the class that obtains the most votes. Algorithm 3 presents the V-classifier.
Algorithm 3: The V-classifier.
Input: Mined rules in the form f_i[μ_{f_i} − σ_max, μ_{f_i} + σ_max] → x_j, and a feature vector g from a test image, where g_i are the features
Output: Report (class of the test image)
1: for each class x_j do
2:   vote_{x_j} = 0;
3:   for each feature f_i do
4:     if g_i is in [μ_{f_i} − σ_max, μ_{f_i} + σ_max] then
5:       vote_{x_j} = vote_{x_j} + 1;
6:     end if
7:   end for
8:   Divide vote_{x_j} by the number of rules of class x_j;
9: end for
10: Return the class with the maximum vote_{x_j}.
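A minimal Python sketch of this voting scheme, assuming the (feature index, V_min, V_max, class) rule tuples produced by the earlier mining sketch, could look like this.

```python
from collections import defaultdict

def v_classifier(rules, g):
    """Voting sketch: rules are (feature_index, v_min, v_max, class_label)
    tuples and g is the feature vector of a test image."""
    votes, n_rules = defaultdict(float), defaultdict(int)
    for i, v_min, v_max, cls in rules:
        n_rules[cls] += 1
        if v_min <= g[i] <= v_max:      # the test image satisfies this rule
            votes[cls] += 1.0
    # Normalize each class's votes by its number of rules and return the winner.
    return max(n_rules, key=lambda cls: votes[cls] / n_rules[cls])
```

For instance, v_classifier(starminer_star(X_train, y_train, 0.1, 0.2, 1.64), g_test) would return the suggested class for a test feature vector g_test; the threshold values here are arbitrary placeholders.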
We can observe that the computational cost of SACMiner is low, since StARMiner* is linear in the number of images (dataset items) and the V-classifier is linear in the number of rules. The low computational cost of the method is reinforced by the fact that StARMiner* mines at most k rules per class x_j, where k is the total number of image features.
4 EXPERIMENTS

We performed several experiments to validate the SACMiner method. Here, we present two of them in the task of suggesting diagnoses for Regions of Interest (ROIs) of mammograms, considering benignant and malignant masses. We used two different evaluation approaches. In the first one, the experiments were performed using the holdout approach, in which we employed 25% of the images of each dataset for testing and the remaining images for training. The second approach was leave-one-out.
To show the efficacy of the method, we compare it with well-known classifiers: 1-NN, C4.5, Naive Bayes and 1R. The 1-nearest neighbor (1-NN) classifier uses the class label of the nearest neighbor, under the Euclidean distance, to classify a new instance. C4.5 (Quinlan, 1993) is a classifier that builds a decision tree in the training phase. Naive Bayes (Domingos and Pazzani, 1997) is a classifier that uses a probabilistic approach based on Bayes' theorem to predict the class labels. Finally, 1R (Holte, 1993) is a rule-based classifier that labels an object/image on the basis of a single attribute (a 1-level decision tree); it requires discrete attributes.
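As a point of reference only, roughly comparable baselines can be assembled with scikit-learn; the snippet below is an approximation of the setup described above, not the configuration used in the paper (1R is approximated by a one-level decision tree and C4.5 by a CART tree, since scikit-learn does not implement the original algorithms).

```python
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

# Rough stand-ins for the baseline classifiers compared in this section.
baselines = {
    "1-NN": KNeighborsClassifier(n_neighbors=1),
    "C4.5-like tree": DecisionTreeClassifier(),
    "Naive Bayes": GaussianNB(),
    "1R-like stump": DecisionTreeClassifier(max_depth=1),
}

def evaluate(X, y):
    """Print the leave-one-out accuracy of each baseline (X: features, y: labels)."""
    for name, clf in baselines.items():
        acc = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
        print(f"{name}: {acc:.4f}")
```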
To compare the classifiers, we compute measures of accuracy, sensitivity and specificity. The accuracy is the portion of cases of the test dataset that were correctly classified. The sensitivity is the portion of the positive cases that were correctly classified, and the specificity is the portion of the negative cases that were correctly classified. An optimal prediction achieves 100% sensitivity (i.e., it predicts all images from the malignant group as malignant) and 100% specificity (i.e., it does not predict any image from the benignant class as malignant). To compute these measures, let us consider the following cases:
True positive: malignant masses correctly diagnosed as malignant;
False positive: benignant masses incorrectly identified as malignant;
True negative: benignant masses correctly identified as benignant;
False negative: malignant masses incorrectly identified as benignant.
Let the number of true positives be TP, the number of false positives be FP, the number of true negatives be TN and the number of false negatives be FN. Equations 7, 8 and 9 present the formulas of accuracy, sensitivity and specificity, respectively.

accuracy = (TP + TN) / (TP + TN + FP + FN)   (7)

sensitivity = TP / (TP + FN)   (8)

specificity = TN / (TN + FP)   (9)
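These three measures translate directly into code; the following sketch computes them from hypothetical confusion counts used for illustration only.

```python
def metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity as in Equations 7-9."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # fraction of malignant masses correctly flagged
    specificity = tn / (tn + fp)   # fraction of benignant masses correctly cleared
    return accuracy, sensitivity, specificity

# Hypothetical confusion counts, for illustration only:
print(metrics(tp=40, tn=45, fp=5, fn=10))  # (0.85, 0.8, 0.9)
```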
Experiment 1: the 250 ROIs Dataset. This dataset consists of 250 ROIs taken from mammograms collected from the Digital Database for Screening Mammography (DDSM, http://marathon.csee.usf.edu/Mammography/Database.html). The dataset is composed of 99 benignant and 151 malignant mass images.
Figure 2: (a) Original image. (b) Image segmented into 5 regions. (c) Mask of the main region.
In the image pre-processing step, the images were segmented using the improved EM/MPM algorithm proposed in (Balan et al., 2005). This algorithm segments the images using a technique that combines a Markov Random Field and a Gaussian Mixture Model to obtain a texture-based segmentation. The segmentation is performed according to a fixed number of different texture regions; in this experiment, we segmented the images into five regions. After the segmentation step, the main region is chosen for feature extraction. This choice is based on the visual characteristic that all these ROIs are centered on the mass; hence, our algorithm uses the centroid of the image to choose the main region. Figure 2 illustrates the pre-processing step.
For the segmented region, eleven shape-based features are extracted: area, major axis length, minor axis length, eccentricity, orientation, convex area, filled area, Euler number, solidity, extent and perimeter. It is important to highlight that the generated feature vector is quite compact.
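As an illustration of how such descriptors could be computed, the sketch below uses scikit-image region properties on the binary mask of the main region; this is an assumption about the tooling, not the feature extractor actually used in the paper.

```python
import numpy as np
from skimage.measure import label, regionprops

def shape_features(mask):
    """Compute the 11 shape descriptors listed above from a binary region mask."""
    # Label connected components and keep the largest one as the main region.
    regions = regionprops(label(mask.astype(np.uint8)))
    props = max(regions, key=lambda r: r.area)
    return np.array([
        props.area, props.major_axis_length, props.minor_axis_length,
        props.eccentricity, props.orientation, props.convex_area,
        props.filled_area, props.euler_number, props.solidity,
        props.extent, props.perimeter,
    ], dtype=float)
```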
In step 2, the feature vectors of the training image set were submitted to StARMiner* to mine statistical association rules. The algorithm mined the following rules:

A[-0.0120, 0.1770] → Benignant   (10)
C[-0.0075, 0.1825] → Benignant   (11)
F[-0.0133, 0.1767] → Benignant   (12)
L[0.2973, 0.4873] → Malignant   (13)

In these rules, A represents the tumor mass area feature; C, the convex area feature; F, the filled area feature; and L, the major axis length feature. These rules mean that masses whose area is in the interval [-0.0120, 0.1770], convex area in [-0.0075, 0.1825] and filled area in [-0.0133, 0.1767] tend to be benignant. On the other hand, masses whose major axis length is in [0.2973, 0.4873] tend to be malignant. For this experiment, we considered a confidence level of 90% for the Z-test and for computing the rule intervals.
The four mined rules and the feature vectors of the test images were given to the classifier. The results using the holdout and the leave-one-out approaches are shown in Tables 1 and 2, respectively.
Table 1: Comparison between SACMiner and other well-
known classifiers using the holdout approach.
Classifiers Accuracy Sensitivity Specificity
SACMiner 0.8548 0.8461 0.8611
1R 0.7258 0.8260 0.6666
Naive Bayes 0.6290 0.9130 0.4615
C4.5 0.7585 0.7391 0.7692
1-NN 0.6129 0.6521 0.5897
Table 2: Comparison between SACMiner and other well-known classifiers using the leave-one-out approach.
Classifiers Accuracy Sensitivity Specificity
SACMiner 0.7680 0.7788 0.7603
1R 0.7680 0.7885 0.7534
Naive Bayes 0.7360 0.8750 0.6370
C4.5 0.7440 0.6154 0.8356
1-NN 0.6760 0.6154 0.7192
Analyzing Table 1, we observe that SACMiner presented the highest values of accuracy and specificity in the holdout approach. When we analyze the sensitivity, we can note that Naive Bayes obtained the best result. However, when we analyze it together with its specificity, we observe that Naive Bayes has low power to classify the benignant images.
In Table 2, SACMiner led to the highest value of accuracy, together with the 1R classifier. In this case, the association rule approach is the best one to classify masses. One advantage of SACMiner over 1R is that SACMiner does not demand the data discretization step. Besides, SACMiner produced just four rules, while 1R produced eight. All the rules mined by 1R were from the major axis length feature (L), the second attribute of the feature vector, and they are described as:

if L < 0.1840 then Benignant   (14)
else if L < 0.2181 then Malignant   (15)
else if L < 0.2367 then Benignant   (16)
else if L < 0.2572 then Malignant   (17)
else if L < 0.2716 then Benignant   (18)
else if L < 0.3126 then Malignant   (19)
else if L < 0.3424 then Benignant   (20)
else if L ≥ 0.3424 then Malignant   (21)
Experiment 2: the 569 ROIs Dataset. This dataset consists of 569 feature vectors obtained from the UCI Machine Learning Repository (Asuncion and Newman, 2007; http://archive.ics.uci.edu/ml/datasets.html). The features describe characteristics of the cell nuclei present in the images. They were computed from breast masses, which are classified into benignant and malignant. For each of the three cell nuclei, the following ten features were computed: mean of distances from the center to points on the perimeter, standard deviation of gray-scale values, perimeter, area, smoothness, compactness, concavity, concave points, symmetry and fractal dimension. Thus, the feature vectors have 30 features, and the classes are distributed in 357 benignant and 212 malignant cases.
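For convenience, the same 569-instance Wisconsin Diagnostic Breast Cancer data also ships with scikit-learn; a minimal loading snippet (assuming scikit-learn is installed) is shown below.

```python
from sklearn.datasets import load_breast_cancer

# The UCI Wisconsin Diagnostic Breast Cancer data: 569 instances, 30 features.
data = load_breast_cancer()
X, y = data.data, data.target      # in scikit-learn, 0 = malignant, 1 = benign
print(X.shape)                     # (569, 30)
print(data.target_names)           # ['malignant' 'benign']
```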
In step 2, StARMiner* mined 19 rules for each class. The results using the holdout and the leave-one-out approaches are shown in Tables 3 and 4, respectively.
Table 3: Comparison between SACMiner and other well-known classifiers using the holdout approach.
Classifiers Accuracy Sensitivity Specificity
SACMiner 0.9859 0.9888 0.9811
1R 0.8943 0.9186 0.8571
Naive Bayes 0.9155 0.9186 0.9107
C4.5 0.9295 0.9419 0.9107
1-NN 0.9577 0.9767 0.9286
Table 4: Comparison between SACMiner and other well-
known classifiers using the leave-one-out approach.
Classifiers Accuracy Sensitivity Specificity
SACMiner 0.9525 0.9860 0.8962
1R 0.9015 0.9356 0.8443
Naive Bayes 0.9349 0.9580 0.8962
C4.5 0.9384 0.9524 0.9151
1-NN 0.9525 0.9580 0.9434
When we analyze the results using the holdout approach in Table 3, we can note that SACMiner leads to the highest values of accuracy, sensitivity and specificity. When we consider the results using the leave-one-out approach, we also observe that its accuracy is one of the highest, matching the result of 1-NN, and that it leads in sensitivity.
5 CONCLUSIONS

In this paper we proposed SACMiner, a new method that employs statistical association rules to support computer-aided diagnosis of breast cancer. The results on real datasets show that the proposed method achieves the highest values of accuracy when
compared with other well-known classifiers (1R, Naive Bayes, C4.5 and 1-NN). Moreover, the method shows a proper balance between sensitivity and specificity, being slightly more specific than sensitive, which is desirable in the medical domain, since it is important to spot the true positives accurately. Two new algorithms were developed to support the method, StARMiner* and the V-classifier. StARMiner* does not demand a discretization step and generates a compact set of rules to compose the learning model of SACMiner. Moreover, its computational cost is low (linear in the number of dataset items). The V-classifier is an associative classifier that works based on the idea of class votes. The experiments showed that the SACMiner method produces high values of accuracy, sensitivity and specificity when compared to other traditional classifiers. In addition, SACMiner produces rules that allow the comprehension of the learning process and, consequently, make the system more reliable for use by radiologists, since they can understand the whole classification process.
ACKNOWLEDGEMENTS

We are thankful to CNPq, CAPES, FAPESP, the University of São Paulo and the Federal University of Rondônia for the financial support.
REFERENCES

Agrawal, R., Imielinski, T., and Swami, A. N. (1993). Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, pages 207-216, Washington, D.C.

Antonie, M.-L., Zaïane, O. R., and Coman, A. (2003). Associative classifiers for medical images. In LNAI 2797, MMCD, pages 68-83. Springer-Verlag.

Asuncion, A. and Newman, D. (2007). UCI Machine Learning Repository.

Aumann, Y. and Lindell, Y. (1999). A statistical theory for quantitative association rules. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 261-270, San Diego, California, USA. ACM Press.

Balan, A. G. R., Traina, A. J. M., Traina Jr., C., and Marques, P. M. d. A. (2005). Fractal analysis of image textures for indexing and retrieval by content. In 18th IEEE International Symposium on Computer-Based Medical Systems (CBMS), pages 581-586, Dublin, Ireland. IEEE Computer Society.

Domingos, P. and Pazzani, M. (1997). On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29(2-3):103-130.
Dua, S., Singh, H., and Thompson, H. W. (2009). Associative classification of mammograms using weighted rules. Expert Systems with Applications, 36(5).

Holte, R. C. (1993). Very simple classification rules perform well on most commonly used datasets. Machine Learning, 11:63-91.

Ordonez, C. and Omiecinski, E. (1999). Discovering association rules based on image content. In IEEE Forum on Research and Technology Advances in Digital Libraries (ADL '99), pages 38-49, Baltimore, USA.

Quinlan, R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA.

Ribeiro, M. X., Balan, A. G. R., Felipe, J. C., Traina, A. J. M., and Traina Jr., C. (2005). Mining statistical association rules to select the most relevant medical image features. In First International Workshop on Mining Complex Data (IEEE MCD'05), pages 91-98, Houston, USA. IEEE Computer Society.

Ribeiro, M. X., Bugatti, P. H., Traina, A. J. M., Traina Jr., C., Marques, P. M. A., and Rosa, N. A. (2009). Supporting content-based image retrieval and computer-aided diagnosis systems with association rule-based techniques. Data & Knowledge Engineering.

Srikant, R. and Agrawal, R. (1996). Mining quantitative association rules in large relational tables. In ACM SIGMOD International Conference on Management of Data, pages 1-12, Montreal, Canada. ACM Press.

Wang, X., Smith, M., and Rangayyan, R. (2004). Mammographic information analysis through association-rule mining. In IEEE CCGEI, pages 1495-1498.