A PROJECTION-BASED HYBRID SEQUENTIAL PATTERNS MINING ALGORITHM

Chichang Jou

Abstract

Sequential pattern mining finds frequently occurring patterns of item sequences from the serial orders of items in a transaction database. Previous approaches either obtain an incomplete set of frequent hybrid sequential patterns or do not scale with growing database sizes. We design and implement a Projection-based Hybrid Sequential PAttern Mining algorithm, PHSPAM, to remedy these problems. PHSPAM first builds the Supplemented Frequent One Sequence itemset to collect items that may appear in frequent hybrid sequential patterns. The mining procedure is then performed recursively in a pattern-growth manner, calculating the support of patterns through projected position arrays, projected support arrays, and projected databases. We compare the results and performance of PHSPAM with those of other hybrid sequential pattern mining algorithms, GFP2 and CHSPAM.
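
The recursive pattern-growth idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not PHSPAM: it is a generic projection-based miner for plain (non-hybrid) sequential patterns over single-item sequences, and it does not build the Supplemented Frequent One Sequence itemset or the projected support arrays. The function name `mine_sequential_patterns`, its parameters, and the representation of a projected database as (sequence index, start position) pairs are all assumptions made for illustration.

```python
from collections import defaultdict


def mine_sequential_patterns(sequences, min_support):
    """Return {pattern: support} for every frequent subsequence pattern.

    Hypothetical helper for illustration only; not the paper's algorithm.
    `sequences` is a list of item lists; `min_support` is an absolute count.
    """
    results = {}

    def grow(prefix, projections):
        # `projections` is a lightweight projected database: one
        # (sequence_index, start_position) pair per sequence that contains
        # the prefix, pointing at the suffix that follows the prefix.
        supporters = defaultdict(set)  # item -> indexes of supporting sequences
        first_pos = {}                 # (item, sequence_index) -> first position
        for seq_idx, start in projections:
            seq = sequences[seq_idx]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if (item, seq_idx) not in first_pos:
                    first_pos[(item, seq_idx)] = pos
                    supporters[item].add(seq_idx)

        for item in sorted(supporters):
            support = len(supporters[item])
            if support < min_support:
                continue  # infrequent extension: prune this branch
            pattern = prefix + (item,)
            results[pattern] = support
            # Project onto the suffixes that follow the first occurrence.
            next_projections = [
                (s, first_pos[(item, s)] + 1) for s in sorted(supporters[item])
            ]
            grow(pattern, next_projections)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results
```

For example, calling `mine_sequential_patterns([list("abc"), list("acb"), list("ab")], 2)` grows the prefix `a` into `a, b` and `a, c`, and stops extending a prefix as soon as no item in its projected suffixes reaches the support threshold. Storing positions instead of copying suffixes keeps each projection small, which is the general motivation for position-based projection.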

Paper Citation


in Harvard Style

Jou C. (2009). A PROJECTION-BASED HYBRID SEQUENTIAL PATTERNS MINING ALGORITHM. In Proceedings of the 11th International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-8111-85-2, pages 152-157. DOI: 10.5220/0001986001520157


in Bibtex Style

@conference{iceis09,
author={Chichang Jou},
title={A PROJECTION-BASED HYBRID SEQUENTIAL PATTERNS MINING ALGORITHM},
booktitle={Proceedings of the 11th International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2009},
pages={152-157},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001986001520157},
isbn={978-989-8111-85-2},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 11th International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - A PROJECTION-BASED HYBRID SEQUENTIAL PATTERNS MINING ALGORITHM
SN - 978-989-8111-85-2
AU - Jou C.
PY - 2009
SP - 152
EP - 157
DO - 10.5220/0001986001520157