
upper bound of the number of frequent patterns
(with minimal frequency allowed = 1) is
$L_{UP} \approx N(1 + 1/K)^{M}$,
but usually it is less.
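To give a feeling for the size of this bound, the following minimal sketch evaluates it numerically. It assumes the bound has the form N(1 + 1/K)^M, and it reads N as the number of objects, M as the number of variables and K as the number of values per variable; these readings and the sample sizes are assumptions made for illustration, not values taken from the paper.

```python
def pattern_upper_bound(n, m, k):
    """Approximate upper bound N * (1 + 1/K)**M (assumed form of the bound)."""
    return n * (1 + 1 / k) ** m


# Hypothetical sizes, chosen only for illustration.
if __name__ == "__main__":
    print(pattern_upper_bound(n=10, m=5, k=2))
```

The bound grows exponentially in M, which is why pruning by a frequency threshold matters in practice.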
4.3 Example of results of MONSA
Patterns (dicliques) extracted from the initial data
matrix (see 1.2) with support T > 20%:
V1.2&V5.1=4 (V1 equals 2 and V5 equals 1; the
frequency of the pattern is 4)
V1.2&V5.1&V2.1&V4.1=3
V1.2&V5.1&V2.1&V4.1&V3.2=2
V1.2&V5.1&V3.1=2
V2.1&V4.1=4
V2.1&V4.1&V3.2=3
V3.2&V1.1&V5.2=2
V2.2&V4.2=2
The table is small, but it illustrates the general
idea.
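The pattern notation used in this example can be read mechanically. The sketch below parses the listed patterns, checks that a pattern never has a higher frequency than any of its sub-patterns, and derives one frequency-based parameter, the confidence of a rule. The function names are hypothetical helpers introduced here for illustration; they are not part of MONSA.

```python
def parse_elements(body):
    """Parse 'V1.2&V5.1' into a frozenset of (variable, value) pairs."""
    return frozenset(
        (var, int(val)) for var, val in (e.split(".") for e in body.split("&"))
    )


def parse_pattern(line):
    """Parse 'V1.2&V5.1=4' into (elements, frequency)."""
    body, freq = line.split("=")
    return parse_elements(body), int(freq)


# The eight patterns listed in the example.
LINES = [
    "V1.2&V5.1=4",
    "V1.2&V5.1&V2.1&V4.1=3",
    "V1.2&V5.1&V2.1&V4.1&V3.2=2",
    "V1.2&V5.1&V3.1=2",
    "V2.1&V4.1=4",
    "V2.1&V4.1&V3.2=3",
    "V3.2&V1.1&V5.2=2",
    "V2.2&V4.2=2",
]
PATTERNS = dict(parse_pattern(line) for line in LINES)


def is_monotone(patterns):
    """True if every pattern is at least as frequent as each of its extensions."""
    return all(
        patterns[a] >= patterns[b]
        for a in patterns
        for b in patterns
        if a < b  # a is a proper subset of b
    )


def confidence(patterns, antecedent, full):
    """Confidence of the rule antecedent -> (full minus antecedent)."""
    return patterns[full] / patterns[antecedent]
```

For example, `confidence(PATTERNS, parse_elements("V1.2&V5.1"), parse_elements("V1.2&V5.1&V2.1&V4.1"))` divides the frequency 3 by the frequency 4, so the rule "V1=2 and V5=1 implies V2=1 and V4=1" holds in three of the four matching objects.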
4.4 Advantages of the algorithm
General properties of the algorithm are as follows:
• The number of results (patterns) can be
controlled via pruning with the T-level
• Several pruning criteria can be used
• Large datasets can be treated easily
• The frequency of every pattern is known at the
moment it is found, so other frequency-based
parameters can be calculated as well
• Variables may take a set of discrete values
(the data need not be binary).
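Several of these properties can be illustrated with a short sketch. The code below is not MONSA; it is a plain depth-first enumeration, swapped in for illustration, that works on discrete (non-binary) values, prunes by a minimum frequency, produces only value combinations that occur in the data, and knows each pattern's frequency when the pattern is found. The data matrix and all names are hypothetical.

```python
from collections import Counter

# Hypothetical 6-object, 3-variable matrix (NOT the paper's data).
DATA = [
    (1, 2, 1),
    (1, 2, 2),
    (1, 1, 1),
    (2, 2, 1),
    (1, 2, 1),
    (2, 1, 3),
]


def frequent_patterns(rows, min_freq):
    """Return {pattern: frequency} for all patterns with frequency >= min_freq.

    A pattern is a tuple of (column, value) pairs with increasing columns.
    """
    results = {}
    n_cols = len(rows[0])

    def extend(pattern, subset, start_col):
        for col in range(start_col, n_cols):
            # Count values only among objects that already match the pattern.
            counts = Counter(row[col] for row in subset)
            for value, freq in counts.items():
                if freq < min_freq:
                    continue  # prune: no extension can be more frequent
                new_pattern = pattern + ((col, value),)
                results[new_pattern] = freq  # frequency known when found
                matching = [r for r in subset if r[col] == value]
                extend(new_pattern, matching, col + 1)

    extend((), rows, 0)
    return results


if __name__ == "__main__":
    for pattern, freq in frequent_patterns(DATA, min_freq=3).items():
        print(pattern, freq)
```

Raising `min_freq` shrinks the result set, which mirrors the role of the T-level as a control on the number of patterns.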
5 CONCLUSION
We have developed an effective pattern mining
algorithm based on a clique-extraction algorithm
that uses Monotone Systems Theory. It does not
combine candidate variables to describe a pattern;
instead, it treats a pattern as a diclique. The
algorithm extracts only patterns that actually exist
in the data matrix and uses simple techniques to
avoid extracting the same pattern repeatedly. We
used this algorithm to build a method named
Hypotheses Generator for fast generation of
association rules (Kuusik et al., 2003). In the future
we hope to find effective pruning measures to
restrict the number of association rules.
REFERENCES
Agrawal, R., Srikant, R., 1994. Fast algorithms for mining
association rules. In VLDB’94, pp. 487-499
Bertin, J., 1981. Graphics and Graphic Information-
Processing. Walter de Gruyter, Berlin New York
Dunham, M. H., 2002. Data Mining: Introductory and
Advanced Topics. Prentice Hall
Fayyad, U. M., Piatetsky-Shapiro, G., Smyth, P., 1996.
From Data Mining to Knowledge Discovery: An
Overview. In Fayyad, U. M., Piatetsky-Shapiro, G.,
Smyth, P., Uthurusamy, R.; Advances in Knowledge
Discovery and Data Mining. AAAI Press/ The MIT
Press, pp.1-36
Hand, D., Mannila, H., Smyth, P., 2001. Principles of
Data Mining. MIT Press
Haralick, R.M., 1974. The Diclique Representation and
Decomposition of Binary Relations. In JACM, 21,3,
pp. 356-366
Hastie, T., Tibshirani, R., Friedman, J. H., 2001. The
Elements of Statistical Learning: Data Mining,
Inference, and Prediction (Springer Series in
Statistics), Springer Verlag
Kuusik, R., 1993. The Super-Fast Algorithm of
Hierarchical Clustering and the Theory of Monotone
Systems. In Transactions of Tallinn Technical
University, No 734, pp. 37-62
Kuusik, R., 1995. Extracting of all maximal cliques:
monotonic system approach. In Proc. of the Estonian
Academy of Sciences. Engineering, No 1, pp. 113-138
Kuusik, R., Lind, G., 2003. An Approach of Data Mining
Using Monotone Systems. In Proceedings of the Fifth
ICEIS. Vol. 2, pp. 482-485
Lin, D.-I., Kedem, Z. M., 1998. Pincer-Search: A New
Algorithm for Discovering the Maximum Frequent
Set. In Proc. of the Sixth European Conf. on Extending
Database Technology
Mullat, I., 1976. Extremal monotone systems. In
Automation and Remote Control, 5, pp. 130-139; 8,
pp. 169-178 (in Russian)
Park, J. S., Chen, M.-S., Yu, P. S., 1996. An Effective
Hash Based Algorithm for Mining Association Rules.
In Proc. of the 1995 ACM-SIGMOD Conf. on
Management of Data, pp. 175-186
Võhandu, L., 1981. Monotone Systems of Data Analysis.
In Transactions of TTU, No 511, pp. 91-100 (in
Russian)
Võhandu, L., 1989a. Fast Methods in Exploratory Data
Analysis. In Transactions of TTU, No 705, pp. 3-13
Võhandu, L., 1989b. A Method for Automatic Generation
of Statements from Examples. In Proceedings of the
Second Scandinavian Conference on Artificial
Intelligence (SCAI ’89), ed. H. Jaakkola, Tampere,
Finland, pp. 185-191
ICEIS 2004 - ARTIFICIAL INTELLIGENCE AND DECISION SUPPORT SYSTEMS