of genes that are suspected to play a role in the
disease. Here the goal is to find genes that could
help to improve prognosis and to select the most
suitable treatments according to the patients' profiles.
The instances are described in Table 1 with their
number of observations, the number of observations in
the positive group (the negative group is the
complement) and the maximal number of variables.
For each instance, we consider different numbers x
of variables in order to evaluate the performance of
the algorithms with regard to this number of
variables.
Table 1: Characteristics of the instances.
Instance     Obs   Positive group size   Var
Random        20                     1    35
ra100 phv    100                    21    50
ra100 phy    105                    31    51
ralsto        73                    27    23
ra phv       108                    22    70
ra phy       112                    31    73
ra rep1      112                    38   155
ra rep2      112                    37    73
rch8         132                     5    37
vote r       435                   168    16
cr60         289                    58    14
os1          289                   224    14
rel1         259                   200    14
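Since the negative group is the complement of the positive group, its size follows directly from the two counts in Table 1. The short sketch below stores the table as a Python dictionary and derives the negative group size; the names `INSTANCES` and `negative_group_size` are illustrative and do not come from the paper.

```python
# Table 1 as (observations, positive group size, maximal number of variables).
# Instance names are copied as they appear in the table.
INSTANCES = {
    "Random": (20, 1, 35),
    "ra100 phv": (100, 21, 50),
    "ra100 phy": (105, 31, 51),
    "ralsto": (73, 27, 23),
    "ra phv": (108, 22, 70),
    "ra phy": (112, 31, 73),
    "ra rep1": (112, 38, 155),
    "ra rep2": (112, 37, 73),
    "rch8": (132, 5, 37),
    "vote r": (435, 168, 16),
    "cr60": (289, 58, 14),
    "os1": (289, 224, 14),
    "rel1": (259, 200, 14),
}


def negative_group_size(name):
    """Size of the negative group: all observations not in the positive group."""
    observations, positives, _max_variables = INSTANCES[name]
    return observations - positives


if __name__ == "__main__":
    for name in INSTANCES:
        print(name, negative_group_size(name))
```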
5.2 Results
Table 2 provides the results obtained on the instances
for computing the set of prime patterns. The first
column gives the name of the instance, with
the number x of variables used (recall that we
consider different sizes for each of the instances). The second
column gives the number of prime patterns
for each instance. The next two columns give the execution
time (in seconds) of our algorithm PPC 2 (Algorithm
4) and the execution time (in seconds) of PPC 1
(Algorithm 1). The last two columns give the
maximal size of the computed patterns (in terms of
number of variables) and the execution time of PPC 1
used with an initial bound equal to this maximal size.
Of course, when using this bound we get the same
results, but PPC 1 is faster since it may stop earlier. Note
that in practice the value of the bound is not known
until the set of prime patterns has been computed.
Running time is limited to 24 hours. "-" denotes
an execution time greater than this limit.
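The reporting convention for the time columns can be sketched as follows. This is only an illustration of the "-" rule stated above: the function name `format_time_cell`, the use of `None` for a run that never finished, and the two-decimal formatting are assumptions, not details taken from the paper. The sketch formats a cell; it does not interrupt a running computation.

```python
TIME_LIMIT_SECONDS = 24 * 60 * 60  # the 24-hour limit stated in the text


def format_time_cell(seconds, limit=TIME_LIMIT_SECONDS):
    """Return the text of a time cell.

    `seconds` is the measured execution time, or None if the run was
    stopped before finishing (assumed convention). Times above the limit
    are reported as "-".
    """
    if seconds is None or seconds > limit:
        return "-"
    return f"{seconds:.2f}"  # two decimals is an arbitrary display choice


if __name__ == "__main__":
    print(format_time_cell(3600.5))
    print(format_time_cell(90000.0))
```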
The execution time of PPC 1 increases with the number
of observations and especially with the number
of variables. PPC 2 is less sensitive to
the number of observations. Nevertheless, its
computation time also increases with the number of
variables.
Note that PPC 1 is able to compute all
prime patterns for instance Random(35) in three days.
Nevertheless, one week is not enough for the instance
rch8(37). PPC 2 is able to compute all prime patterns
for instance ra rep2(65) in slightly more than one hour,
while PPC 1 is not able to solve the same instance with
only 20 variables.
In PPC 1, the number of iterations is related to the
number of variables. The last column shows
that if a good bound is available (equal to the size
of the largest prime pattern), prime patterns can be
computed more efficiently. Nevertheless, the execution
time remains high compared to PPC 2. Moreover, this
approach requires knowing the size of the largest pattern
in advance.
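To illustrate why a bound on the pattern size lets a degree-by-degree enumeration stop earlier, here is a minimal brute-force sketch on Boolean data. It is not PPC 1 or PPC 2: it is a naive enumeration written only for illustration, and the names `prime_patterns`, `covers` and `max_degree` are hypothetical. It assumes the usual LAD definitions: a pattern is a conjunction of literals that covers at least one positive observation and no negative observation, and a pattern is prime if removing any literal yields a term that is no longer a pattern.

```python
from itertools import combinations, product


def covers(term, observation):
    """True if every literal (variable index, value) of the term holds."""
    return all(observation[i] == v for i, v in term)


def prime_patterns(positives, negatives, max_degree=None):
    """Enumerate prime patterns by increasing degree (number of literals).

    If max_degree is given, degrees above it are never explored, which is
    where the time saving of a good bound comes from.
    """
    n = len(positives[0])
    if max_degree is None:
        max_degree = n
    found = []  # prime patterns, each a frozenset of (index, value) literals
    for degree in range(1, max_degree + 1):
        for indices in combinations(range(n), degree):
            for values in product((0, 1), repeat=degree):
                term = frozenset(zip(indices, values))
                # A term that strictly contains a smaller pattern is not prime.
                if any(p < term for p in found):
                    continue
                if not any(covers(term, o) for o in positives):
                    continue
                if any(covers(term, o) for o in negatives):
                    continue
                found.append(term)
    return found


if __name__ == "__main__":
    positives = [(1, 1, 0), (0, 0, 1)]
    negatives = [(1, 0, 0), (0, 1, 0)]
    everything = prime_patterns(positives, negatives)
    largest = max(len(p) for p in everything)
    bounded = prime_patterns(positives, negatives, max_degree=largest)
    print(everything == bounded, largest)
```

On this toy data the largest prime pattern has two literals, so running with `max_degree=2` returns the same set without exploring the terms of degree 3. As in the text, the bound is only known once the full set has been computed.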
6 CONCLUSION
In this paper we have defined a new algorithm to
generate complete sets of prime patterns and strong prime
patterns in the LAD context. Compared to the
state-of-the-art algorithms for these problems, our algorithm
is able to handle larger data sets. The main idea
of its resolution process is to use an extension of
LAD (multiple characterization of data) in order to
first compute the non-dominated solutions and then
obtain all the prime and strong prime patterns.
Experiments show the efficiency of our algorithm in terms
of running times and instance sizes.
ICPRAM 2019 - 8th International Conference on Pattern Recognition Applications and Methods