Table 2: Comparison of MSE between our approach and a single neural network on three time series.

Method                   IceTargets merged   IceTargets merged   QuarterlyGross-
                         with noise          with cos            FarmProduct
Our approach             0.62                0.0170              0.007
Single neural network    0.93                0.5144              0.0211
Improvement factor       1.5 times           30.25 times         3.014 times
ties of each subsequence and to take advantage of both methods. To show that this is possible, we merged two series with completely different prediction properties into one time series. We chose a simple series that grows linearly with time and the IceTargets statistical data set, whose expected value does not appear to change over time. We merged them according to the pattern:
X = (1, 2, IceTargets(1), IceTargets(2), 3, 4, IceTargets(3), IceTargets(4), 5, 6, IceTargets(5), IceTargets(6), ...).
The pattern is described by the indicator vector
S = (1, 1, 0, 0, 1, 1, 0, 0, ...).
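As an illustration, the sketch below builds such a merged series. It is only a toy reconstruction: the synthetic `ice` series is a stand-in for the real IceTargets data (assumed here to be roughly constant-mean noise), and the names `linear`, `ice`, `S`, and `X` are ours, not part of the original experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 100                               # values taken from each source series
linear = np.arange(1.0, n + 1)        # 1, 2, 3, ... : grows linearly with time
ice = rng.normal(10.0, 1.0, size=n)   # stand-in for IceTargets: roughly constant expected value

# Repeating indicator S = (1, 1, 0, 0, ...): 1 -> next value of the linear series,
# 0 -> next value of the IceTargets stand-in.
S = np.tile([1, 1, 0, 0], n // 2)

lin_vals, ice_vals = iter(linear), iter(ice)
X = np.array([next(lin_vals) if s == 1 else next(ice_vals) for s in S])
# X = (1, 2, ice[0], ice[1], 3, 4, ice[2], ice[3], ...)
```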
We apply our approach, which splits the time series into two subsequences (see Table 1). To predict X1 we use linear regression, and to predict X0 we use a neural network. In this way we exploit the advantages of both methods and obtain MSE = 0.0101. Using a single neural network yields MSE = 2535.45, and using only linear regression yields MSE = 30.35 (see Table 3); a toy sketch of this split-and-predict step is given after Table 3. Our approach therefore gives a prediction error over 250,000 times smaller than using only a neural network and about 3,000 times smaller than using only linear regression.
Table 3: Comparison of MSE obtained with different methods for the time series created by merging the linear function and IceTargets.

Method   Neural network   Linear regression   Our approach
MSE      2535.45          30.35               0.0101
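To make the split-and-predict step concrete, the following sketch (continuing the toy reconstruction above) separates X into the subsequences X1 and X0 according to S, fits linear regression to X1 and a small neural network to X0, and reports the MSE of each. It only illustrates the idea under the stated assumptions; the model choices, lag length, and train/test split are ours, not the exact experimental setup.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.neural_network import MLPRegressor

# Rebuild the merged toy series from the previous sketch.
rng = np.random.default_rng(0)
n = 100
linear, ice = np.arange(1.0, n + 1), rng.normal(10.0, 1.0, size=n)
S = np.tile([1, 1, 0, 0], n // 2)
lin_vals, ice_vals = iter(linear), iter(ice)
X = np.array([next(lin_vals) if s == 1 else next(ice_vals) for s in S])

# Split X into the two subsequences indicated by S.
x1, x0 = X[S == 1], X[S == 0]

# X1 (linear trend): linear regression on the time index.
t = np.arange(len(x1)).reshape(-1, 1)
cut = int(0.8 * len(x1))
lr = LinearRegression().fit(t[:cut], x1[:cut])
mse1 = mean_squared_error(x1[cut:], lr.predict(t[cut:]))

# X0 (roughly constant mean): small neural network on lagged values.
p = 4                                             # number of past values used as features
F = np.array([x0[i:i + p] for i in range(len(x0) - p)])
y = x0[p:]
cut0 = int(0.8 * len(y))
nn = MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0)
nn.fit(F[:cut0], y[:cut0])
mse0 = mean_squared_error(y[cut0:], nn.predict(F[cut0:]))

print(f"MSE on X1 (linear regression): {mse1:.4f}")
print(f"MSE on X0 (neural network):    {mse0:.4f}")
```

Each subsequence is thus handled by the method suited to its behaviour, which is exactly the effect reported in Table 3.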
6 CONCLUSIONS
In the presented work, we proposed a novel method for time series forecasting. Our approach is based on splitting the series into a subsequence and its complement, which can result in a much lower potential prediction error. Moreover, it allows different prediction methods to be applied to the two subsequences, thereby combining their benefits. The proposed approach is not tied to any specific time series forecasting method and can be applied as a generic preprocessing step for time series. We also showed that our approach enables noise filtering. In order to validate the efficiency of the introduced solution, we conducted a series of experiments. The obtained results show that our approach yields a significant improvement in accuracy. Moreover, we showed that the generated overhead is asymptotically logarithmic with respect to the time series length. This low computational overhead suggests that the approach is useful regardless of the time series length. In addition, the algorithm can be parallelized, so the computation time can be reduced further by running it on multiple processors.
Our solution opens up broad prospects for further work. First of all, our approach uses strict partitioning clustering, in which every element belongs to exactly one cluster. Future research may design and examine a variant of our approach based on overlapping clustering, where a single element may belong to several clusters. The efficiency of such a modification should be investigated on real data. Another open question is the influence of the choice of the maximal searched pattern period and the minimal acceptable subseries length on the prediction efficiency of our approach. A further area of research could be the design and implementation of an automated method for selecting different prediction methods for the proposed subseries.
REFERENCES
Duda, R. and Hart, P. (1973). Pattern Classification and Scene Analysis. John Wiley and Sons, NY, USA.
Estivill-Castro, V. (2002). Why so many clustering algorithms: a position paper. ACM SIGKDD Explorations Newsletter, 4(1):65-75. doi:10.1145/568574.568575.
Karypis, G., Han, E.-H., and Kumar, V. (1999). Chameleon: hierarchical clustering using dynamic modeling. Computer, 32(8):68-75.
http://lib.stat.cmu.edu/datasets/.
https://datamarket.com/data/set/22xn/quarterly-australian-gross-farm-product-m-198990-prices-sep-59-mar-93#!ds=22xn&display=line.
Wu, H., Salzberg, B., Sharp, G. C., Jiang, S. B., Shirato, H., and Kaeli, D. (2005). Subsequence matching on structured time series data. In SIGMOD.
Han, J. and Kamber, M. (2001). Data Mining: Concepts and Techniques. Morgan Kaufmann, San Francisco, pp. 346-389.
Chen, A.-S., Leung, M. T., and Daouk, H. (2003). Application of neural networks to an emerging financial market: forecasting and trading the Taiwan Stock Index. Computers & Operations Research, 30:901-923.