to perform a specific function or just define the mini-
mal values of covering and accuracy of rules resulting
from mining algorithms.
4 EXPERIMENTAL RESULTS
To test AGMI, we used real data from ComprasNet, the Brazilian
Federal Bidding online system. The data covers all bidding processes
for a specific type of service contracted by Federal Executive
agencies between 2005 and 2008, across all states of Brazil (26 states
plus the Federal District). The database includes 26,615 records,
2,701 bidding processes and 3,051 companies. Each record in the
dataset represents one bid by a company in a specific bidding process.
4.1 First Experiment
We started our experiments with AGMI using three agents: the
coordinator, the evaluator and the rule association agent. The Rule
Association Agent used the Apriori algorithm, available in the Weka
framework. The algorithm was adapted to DMA so that it could be used
in a multi-agent environment. In previous tests, before the AGMI
prototype was implemented (Silva and Ralha, 2010), we found that a
high lower bound of support may yield rules that merely reflect the
presence of big companies in bidding processes (frequent itemsets).
Thus, setting a high lower bound of support in this algorithm can
suppress several good rules that show real features of cartels. On
the other hand, a high lower bound of confidence ensures the selection
of good rules.
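The two thresholds discussed above can be illustrated with a minimal Python sketch. It is not the Weka Apriori implementation used by the Rule Association Agent; it only shows how the support count and confidence of a single rule could be computed, assuming each bidding process is treated as a set of participating companies. The function names and the toy data are hypothetical.

```python
from typing import FrozenSet, List


def support_count(transactions: List[FrozenSet[str]], itemset: FrozenSet[str]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> float:
    """Confidence of the rule antecedent -> consequent."""
    antecedent_count = support_count(transactions, antecedent)
    if antecedent_count == 0:
        return 0.0  # the rule never fires, so its confidence is taken as 0
    both_count = support_count(transactions, antecedent | consequent)
    return both_count / antecedent_count


# Hypothetical toy data: each set lists the companies bidding in one process.
biddings = [
    frozenset({"A", "B", "C"}),
    frozenset({"A", "B"}),
    frozenset({"A", "B", "D"}),
    frozenset({"C", "D"}),
    frozenset({"A", "D"}),
]

# Companies A and B bid together in 3 processes; A bids in 4 processes.
rule_support = support_count(biddings, frozenset({"A", "B"}))
rule_confidence = confidence(biddings, frozenset({"A"}), frozenset({"B"}))
```

In this toy data the rule A -> B has a support count of 3 and a confidence of 0.75, so it would pass a support bound of 3 occurrences but fail a 90% confidence bound.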
With the help of experts, in our first experiment with AGMI we set
the lower bound of support to get rules with 9 occurrences in the
database, and the lower bound of confidence was set to 90%. We also
defined Equation 1 to evaluate the rules obtained through the DM
process. This function was implemented in the Evaluator Agent to
measure the quality of the rules; it represents the probability that
a company of the suspicious group wins a bidding process. The higher
the Rule Quality (RQ) value, the more suspicious the group is of
practicing a cartel.
$$RQ = 100 \cdot \frac{V(C)}{Sup. \times Inst.} \qquad (1)$$
In Equation 1, Sup. is the support value of the rule; Inst. is the
total number of instances; C is the set of companies in the rule; B is
the set of bidding processes in which the company group C has
participated; and V(C) is the number of victories in B by any company
in C.
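As an illustration of Equation 1, the following Python sketch computes RQ from its three inputs. It assumes that Sup. is expressed as a fraction of the instances, so that Sup. × Inst. is the number of occurrences of the rule; the function name and all numeric values are hypothetical and are not taken from the experiments.

```python
def rule_quality(victories: int, support: float, instances: int) -> float:
    """Equation 1: RQ = 100 * V(C) / (Sup. * Inst.).

    `support` is assumed to be a relative support (fraction of instances).
    """
    occurrences = support * instances  # number of instances matching the rule
    if occurrences <= 0:
        raise ValueError("support * instances must be positive")
    return 100.0 * victories / occurrences


# Hypothetical values: a rule seen in 26 of 1,000 instances, whose
# companies won 12 of the bidding processes they took part in together.
instances = 1000
support = 26 / instances
rq = rule_quality(victories=12, support=support, instances=instances)
```

Under these assumed values RQ is roughly 46, that is, the group wins a little under half of the bidding processes in which it appears together.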
We ran our tests on two computers: Host A (Intel Core 2, 2.40 GHz,
2.00 GB RAM) and Host B (Intel Pentium Dual, 1.86 GHz, 2.00 GB RAM).
The coordinator and evaluator agents ran on Host A, and the rule
association agent ran on Host B. In this experiment, we found 128
rules, and the execution time was 29 minutes. The top 100 rules scored
an average of 16.56 in the RQ evaluation. Their average support, on
the other hand, was 30.98.
The top 10 rules scored an average of 33.00 in the RQ evaluation, and
their average support was 26.90. The best rule according to Equation 1
had an RQ of 46 and a support of 26. Note that the top 10 rules have a
smaller average support than the top 100 rules. After validating the
results with auditing experts, we concluded that higher limits of
support do not guarantee better rules.
4.2 Second Experiment
In the first experiment, with only one mining agent in the system
(Rule Association), we could not set the lower bound of support below
24. This was due to the limited memory available on our machines: the
lower the bound of support, the more memory the rule association
algorithm consumes. Furthermore, it is quite possible for rules with
support below 24 to have real characteristics of a cartel. Thus, we
needed a strategy to partition our search space so as to seek the
groups more accurately. Clustering is an appropriate DM technique for
this task.
In addition to the Rule Association Agent, we used a Clustering Agent,
which implemented the Expectation-Maximization (EM) algorithm to
discover clusters considering the companies and the Brazilian states.
According to Han and Kamber (2005), in the EM algorithm each object is
assigned to each cluster according to a weight representing its
probability of membership. In these experiments, the Clustering Agent
ran on Host A.
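The membership weights mentioned above correspond to the expectation step of EM. The sketch below shows that step for a one-dimensional Gaussian mixture, which is an assumption made only for illustration: the attributes clustered in the paper (companies and states) are not numeric in this way, and the Clustering Agent's actual implementation is not shown here. All names and parameter values are hypothetical.

```python
import math
from typing import List, Sequence


def gaussian_pdf(x: float, mean: float, std: float) -> float:
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def membership_weights(
    x: float,
    priors: Sequence[float],
    means: Sequence[float],
    stds: Sequence[float],
) -> List[float]:
    """E-step for one object: probability of membership in each cluster."""
    joint = [p * gaussian_pdf(x, m, s) for p, m, s in zip(priors, means, stds)]
    total = sum(joint)
    if total == 0.0:
        # Densities underflowed; fall back to a uniform assignment.
        return [1.0 / len(joint)] * len(joint)
    return [j / total for j in joint]


# Hypothetical two-cluster mixture with equal priors and unit variance.
weights_mid = membership_weights(1.0, [0.5, 0.5], [0.0, 2.0], [1.0, 1.0])
weights_left = membership_weights(0.0, [0.5, 0.5], [0.0, 2.0], [1.0, 1.0])
```

An object halfway between the two cluster means receives equal weights, while an object at the first mean receives a higher weight for the first cluster. In both cases the weights sum to 1, so every object belongs to every cluster to some degree.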
With the addition of the Clustering Agent, we found the regions of
public biddings and the companies that participated in those biddings
in each region. Seven clusters of states were found and, for each
cluster region, the rule association agent ran its technique to search
for associated companies. The clusters found were: 1 - {AM, PA, AC,
RO, AP, RR, MT, MS, RJ, ES, MG}; 2 - {RS, SC, PR}; 3 - {BA, SE};
4 - {PE, PB, RN, CE, PI, MA}; 5 - {TO, DF, GO}; 6 - {SP}; 7 - {AL}.
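The partitioning step described above can be sketched in Python as follows. The state clusters are the seven listed in the text; the record layout (process identifier, state, company) and the function name are assumptions made for illustration, not the actual AGMI data model.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# The seven clusters of states reported in the text.
STATE_CLUSTERS = {
    1: ["AM", "PA", "AC", "RO", "AP", "RR", "MT", "MS", "RJ", "ES", "MG"],
    2: ["RS", "SC", "PR"],
    3: ["BA", "SE"],
    4: ["PE", "PB", "RN", "CE", "PI", "MA"],
    5: ["TO", "DF", "GO"],
    6: ["SP"],
    7: ["AL"],
}
STATE_TO_CLUSTER = {
    state: cluster for cluster, states in STATE_CLUSTERS.items() for state in states
}


def partition_by_cluster(
    bids: Iterable[Tuple[int, str, str]],
) -> Dict[int, Dict[int, Set[str]]]:
    """Group bids by state cluster, then by bidding process.

    Each bid is a hypothetical (process_id, state, company) record. The
    result maps cluster -> process_id -> set of companies, so that rule
    mining can be run on each cluster separately.
    """
    partitions: Dict[int, Dict[int, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for process_id, state, company in bids:
        partitions[STATE_TO_CLUSTER[state]][process_id].add(company)
    return {cluster: dict(processes) for cluster, processes in partitions.items()}


# Hypothetical toy bids.
toy_bids = [(1, "SP", "A"), (1, "SP", "B"), (2, "RS", "A"), (3, "PR", "C")]
toy_partitions = partition_by_cluster(toy_bids)
```

Mining each partition separately keeps the number of transactions per run small, which is one plausible reason a lower support bound becomes feasible within the same memory budget.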
In this experiment, we found 6,150 rules. The top 100 rules had an
average RQ of 69.71 and an average support of 9.78. In the top 10
rules, the RQ