Time Series Forecasting using Clustering with Periodic Pattern

Jan Kostrzewa

2015

Abstract

Time series forecasting has attracted a great deal of attention from various research communities. One method that improves forecasting accuracy is time series clustering. The contribution of this work is a new clustering method that relies on finding a periodic pattern by splitting the time series into two subsequences (clusters) with a lower potential prediction error than the whole series. Given such subsequences, we predict their values separately with methods customized to the specifics of each subsequence, then merge the results according to the pattern to obtain a prediction of the original time series. To check the efficiency of our approach, we analyze various artificial data sets. We also present a real data set for which our approach gives more than 300% improvement in prediction accuracy. We show that for artificially created series we obtain an even more pronounced accuracy improvement. Additionally, our approach can be used for noise filtering. We consider noise that follows a periodic repetitive pattern, and we present a simulation in which we recover the correct series from data where 50% of the elements are random noise.
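
The split, predict, and merge steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the periodic pattern is assumed to be already known (the paper's contribution is finding it), and `drift_forecast` is a hypothetical placeholder for the "customized" per-subsequence predictors, which the abstract does not specify. All function names here are illustrative assumptions.

```python
def split_by_pattern(series, pattern):
    """Assign series[i] to cluster pattern[i % len(pattern)] (0 or 1)."""
    clusters = ([], [])
    for i, value in enumerate(series):
        clusters[pattern[i % len(pattern)]].append(value)
    return clusters


def drift_forecast(sub, steps):
    """Placeholder predictor (an assumption, not the paper's method):
    extend the subsequence by its average step between first and last value."""
    if len(sub) < 2:
        return [sub[-1] if sub else 0.0] * steps
    slope = (sub[-1] - sub[0]) / (len(sub) - 1)
    return [sub[-1] + slope * (k + 1) for k in range(steps)]


def forecast_with_pattern(series, pattern, horizon):
    """Forecast each cluster separately, then merge following the pattern."""
    clusters = split_by_pattern(series, pattern)
    n = len(series)
    # Cluster label of each future position, continuing the periodic pattern.
    future_labels = [pattern[(n + k) % len(pattern)] for k in range(horizon)]
    # Forecast each cluster only as many steps as the pattern assigns to it.
    preds = [
        iter(drift_forecast(clusters[c], future_labels.count(c)))
        for c in (0, 1)
    ]
    # Interleave the per-cluster forecasts in pattern order.
    return [next(preds[label]) for label in future_labels]


if __name__ == "__main__":
    # Two interleaved trends: 1, 2, 3, 4 and 100, 90, 80, 70.
    series = [1, 100, 2, 90, 3, 80, 4, 70]
    print(forecast_with_pattern(series, [0, 1], 4))  # [5.0, 60.0, 6.0, 50.0]
```

In this toy series, a single predictor fitted to the whole sequence would have to model the alternation between the two trends, whereas each subsequence on its own is a simple linear trend. This is the intuition behind splitting into clusters with lower prediction error.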


Paper Citation


in Harvard Style

Kostrzewa J. (2015). Time Series Forecasting using Clustering with Periodic Pattern. In Proceedings of the 7th International Joint Conference on Computational Intelligence - Volume 3: NCTA, (ECTA 2015) ISBN 978-989-758-157-1, pages 85-92. DOI: 10.5220/0005586900850092


in Bibtex Style

@conference{ncta15,
author={Jan Kostrzewa},
title={Time Series Forecasting using Clustering with Periodic Pattern},
booktitle={Proceedings of the 7th International Joint Conference on Computational Intelligence - Volume 3: NCTA, (ECTA 2015)},
year={2015},
pages={85-92},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005586900850092},
isbn={978-989-758-157-1},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 7th International Joint Conference on Computational Intelligence - Volume 3: NCTA, (ECTA 2015)
TI - Time Series Forecasting using Clustering with Periodic Pattern
SN - 978-989-758-157-1
AU - Kostrzewa J.
PY - 2015
SP - 85
EP - 92
DO - 10.5220/0005586900850092