Authors:
Kezban Dilek Onal and Pinar Karagoz
Affiliation:
Middle East Technical University, Turkey
Keyword(s):
WAP-Tree (Web Access Pattern Tree), Sequential Pattern Mining, FOF (First Occurrence Forest), Sibling
Principle, Web Usage Mining.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Data Mining; Databases and Information Systems Integration; Enterprise Information Systems; Sensor Networks; Signal Processing; Soft Computing
Abstract:
Sequential pattern mining underlies the solution of many problems in web mining, especially in web usage
mining. Research on sequence mining continues to seek faster algorithms. WAP-Tree based algorithms
that emerged from the web usage mining literature have shown remarkable performance on single-item
sequence databases. In this study, we investigate the application of WAP-Tree based mining to multi-item sequential
pattern mining and we present MULTI-WAP-Tree, which extends WAP-Tree for multi-item sequence
databases. In addition, we propose a new algorithm, MULTI-FOF-SP (MULTI-FOF-Sibling Principle), that
extracts patterns from the MULTI-WAP-Tree. MULTI-FOF-SP is based on the earlier WAP-Tree based algorithm
FOF (First Occurrence Forest) and an early pruning strategy from the literature called "Sibling Principle".
Experimental results reveal that MULTI-FOF-SP finds patterns faster than PrefixSpan on dense multi-item
sequence databases with small alphabets.
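
To make the term "multi-item sequence database" concrete, the following is a minimal sketch of how the support of a candidate pattern can be counted when each sequence element is a set of items rather than a single item. It is not MULTI-WAP-Tree or MULTI-FOF-SP; it is a naive scan for illustration only, and the names `contains`, `support`, and `example_db` as well as the toy data are assumptions that do not come from the paper.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]
Sequence = List[Itemset]


def contains(sequence: Sequence, pattern: Sequence) -> bool:
    """Return True if `pattern` is a subsequence of `sequence`.

    Each pattern itemset must be a subset of a distinct sequence element,
    and the matched elements must appear in the same order. Matching each
    pattern itemset at the earliest possible element is sufficient.
    """
    pos = 0  # index of the next pattern itemset still to be matched
    for element in sequence:
        if pos < len(pattern) and pattern[pos] <= element:
            pos += 1
    return pos == len(pattern)


def support(database: List[Sequence], pattern: Sequence) -> int:
    """Count the sequences in `database` that contain `pattern`."""
    return sum(1 for sequence in database if contains(sequence, pattern))


def seq(*itemsets: str) -> Sequence:
    """Hypothetical helper: seq("ab", "c") builds [{a, b}, {c}]."""
    return [frozenset(items) for items in itemsets]


# Toy multi-item sequence database (illustrative data only).
example_db = [
    seq("ab", "c", "a"),
    seq("a", "bc"),
    seq("ab", "ac"),
    seq("b", "c"),
]
```

On this toy database, `support(example_db, seq("a", "c"))` counts the sequences in which an element containing `a` is later followed by an element containing `c`. A pattern such as `seq("b", "c")` is not matched by the second sequence, because there `b` and `c` occur in the same element rather than in successive ones. Tree-based miners like those described in the abstract aim to avoid this kind of repeated database scan.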