space than the space occupied by an item in the horizontal
representation. Consequently, if the FOF approach were applied to the
MULTI-WAP-Tree without the sibling principle, both the memory and the
time requirements of the FOF approach would be higher for large
alphabets. The MULTI-WAP-Tree must achieve a high degree of
compression to outperform PrefixSpan and LAPIN in terms of both
execution time and memory. We observed that this requirement cannot
be met for large alphabets.
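The dependence of compression on alphabet size can be illustrated with
a minimal sketch. It assumes a plain prefix tree over single-item
sequences, which is a simplification and not the MULTI-WAP-Tree itself;
the function names and the toy sequences are hypothetical and chosen
only for illustration. The ratio computed here is the number of items
stored in the horizontal layout divided by the number of tree nodes, so
a ratio near 1 means the tree saves almost nothing.

```python
def count_trie_nodes(sequences):
    """Count the nodes of a plain prefix tree built over the sequences.

    Assumption: each sequence is a string of single-character items.
    """
    root = {}
    nodes = 0
    for seq in sequences:
        node = root
        for item in seq:
            if item not in node:
                # A new node is needed only when the prefix is not shared.
                node[item] = {}
                nodes += 1
            node = node[item]
    return nodes


def compression_ratio(sequences):
    """Items in the horizontal layout divided by prefix-tree nodes."""
    total_items = sum(len(seq) for seq in sequences)
    return total_items / count_trie_nodes(sequences)


# Small alphabet: prefixes are shared, so several items map to one node.
small_alphabet = ["abab", "abba", "abab", "abaa"]

# Large alphabet: no prefixes are shared, so every item needs its own node.
large_alphabet = ["abcd", "efgh", "ijkl", "mnop"]

print(compression_ratio(small_alphabet))
print(compression_ratio(large_alphabet))
```

In this toy setting the small-alphabet input needs 7 nodes for 16 items,
while the large-alphabet input needs 16 nodes for 16 items. Since a tree
node is larger than a horizontally stored item, the second case would
cost more memory than the horizontal representation.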
6 CONCLUSIONS AND FUTURE WORK
In this paper, we introduced a new data structure, the
MULTI-WAP-Tree, and a new algorithm, MULTI-FOF-SP, for extracting
multi-item sequence patterns. The MULTI-WAP-Tree is the first tree
structure for representing general sequence databases. MULTI-FOF-SP
employs an early pruning idea, the Sibling Principle.
We ran experiments on several test cases to compare MULTI-FOF-SP
with two previous multi-item sequence mining algorithms, PrefixSpan
and LAPIN-LCI. The experiments revealed that, on dense multi-item
databases with small alphabets, MULTI-FOF-SP outperforms PrefixSpan
and comes close to LAPIN-LCI in terms of execution time. In addition,
it uses less memory than LAPIN-LCI on these databases.
In this work, we devised a MULTI-WAP-Tree-based algorithm that uses
the sibling principle and obtained good results. As a continuation
of this line of work, other existing tree-based algorithms can be
investigated for multi-item sequence mining using the MULTI-WAP-Tree
data structure.
WEBIST 2014 - International Conference on Web Information Systems and Technologies