performance between the results we obtained for
these values and those we reported in this paper for
which NrGen was set to 100. The reasons for the
overfitting of the solutions resulting from the FCM
clustering might be the nature of the application
itself (Muhammad Fuad, 2013).
6 CONCLUSIONS
Applying bio-inspired optimization algorithms to
data mining tasks is not trivial, given the complexity
of these tasks and the stochastic behavior of the
algorithms. In this paper we applied three widely
used bio-inspired optimization algorithms
(differential evolution, genetic algorithms, and
particle swarm optimization) to fuzzy c-means (FCM)
clustering of time series data. The optimizers were
used to obtain the optimal values of the weights
assigned to a combination of distance metrics, and
this weighted combination was then used in the FCM
clustering of the time series. Our experiments
showed that, while all the optimizers improved the
performance on the training datasets, the
improvement dropped when the optimized weights
were applied to the testing datasets, as a result of
the overfitting that appeared during the
optimization process.
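As a minimal sketch of the quantity being optimized, the following Python code combines several distance metrics with a weight vector and plugs the result into the standard FCM membership update. The three metrics (Euclidean, Manhattan, Chebyshev), the function names, and the example weights are illustrative assumptions and are not taken from the experiments; the optimizer itself (differential evolution, genetic algorithms, or particle swarm optimization) is not shown and would search over `weights`.

```python
import math

# Assumed component metrics; the metrics used in the paper may differ.
def euclidean(s, t):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))

def manhattan(s, t):
    return sum(abs(a - b) for a, b in zip(s, t))

def chebyshev(s, t):
    return max(abs(a - b) for a, b in zip(s, t))

METRICS = (euclidean, manhattan, chebyshev)

def combined_distance(s, t, weights, metrics=METRICS):
    """Weighted sum of the component distances between series s and t."""
    return sum(w * d(s, t) for w, d in zip(weights, metrics))

def fcm_memberships(series, centers, weights, m=2.0):
    """Standard FCM membership of one series in each cluster center,
    computed with the weighted combined distance (fuzzifier m > 1)."""
    dists = [combined_distance(series, c, weights) for c in centers]
    # A series that coincides with one or more centers belongs fully to them.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
            for d_i in dists]

# Hypothetical example: two cluster centers and one short series.
centers = [[0.0, 0.0, 0.0], [4.0, 4.0, 4.0]]
example_weights = [0.5, 0.3, 0.2]
u = fcm_memberships([1.0, 1.0, 1.0], centers, example_weights)
```

Under this sketch, an optimizer would score each candidate weight vector by the clustering quality it yields on the training datasets. The overfitting described above corresponds to weights that score well there but lose much of that gain when reused, unchanged, on the testing datasets.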
REFERENCES
Bustos, B. and Skopal, T., 2006. Dynamic similarity
search in multi-metric spaces. Proceedings of the ACM
Multimedia, MIR Workshop. ACM Press, New York,
NY.
Esling, P., and Agon, C., 2012. Time-series data mining.
ACM Comput. Surv.
Feoktistov, V., 2006. Differential evolution: in search of
solutions (Springer Optimization and Its
Applications). Secaucus, NJ, USA. Springer-Verlag
New York, Inc.
Krzysztof, J.C., Pedrycz, W., Swiniarski, R.W., Kurgan,
L.A., 2007. Data mining: a knowledge discovery
approach. Springer-Verlag New York, Inc. Secaucus,
New Jersey.
Haupt, R.L., Haupt, S. E., 2004. Practical genetic
algorithms with CD-ROM. Wiley-Interscience.
Keogh, E., Zhu, Q., Hu, B., Hao, Y., Xi, X., Wei, L.,
Ratanamahatana, C.A., 2011. The UCR time series
classification/clustering homepage:
www.cs.ucr.edu/~eamonn/time_series_data/
Larose, D., 2005. Discovering knowledge in data: an
introduction to data mining, Wiley, Hoboken, NJ.
Liao, T., 2005. Clustering of time series data–a survey.
Pattern Recognition.
Maulik, U., Bandyopadhyay, S., Mukhopadhyay, A., 2011.
Multiobjective genetic algorithms for clustering:
applications in data mining and bioinformatics.
Springer.
Mitchell, M., 1996. An introduction to genetic algorithms,
MIT Press, Cambridge, MA.
Muhammad Fuad, M.M., 2014a. A synergy of artificial
bee colony and genetic algorithms to determine the
parameters of the Σ-Gram distance. In DEXA 2014,
Munich, Germany. Lecture Notes in Computer Science
Volume 8645, 2014, pp 147-154.
Muhammad Fuad, M.M., 2012a. Differential evolution
versus genetic algorithms: towards symbolic aggregate
approximation of non-normalized time series. In
IDEAS’12, Prague, Czech Republic. Published by
BytePress/ACM.
Muhammad Fuad, M.M., 2014b. One-step or two-step
optimization and the overfitting phenomenon: a case
study on time series classification. In ICAART 2014.
Muhammad Fuad, M.M., 2012b. Using differential
evolution to set weights to segments with different
information content in the piecewise aggregate
approximation. In KES 2012, San Sebastian, Spain,
(FAIA). IOS Press.
Muhammad Fuad, M.M., 2013. When optimization is just
an illusion. In ADMA 2013, Part I. LNCS, vol. 8346,
pp. 121–132. Springer, Heidelberg.
Witten, I.H., Frank, E., 2009. Data mining: practical
machine learning tools and techniques, second
edition, Elsevier.