relative improvement in F-score and accomplish
variable selection. All in all, this study combines, for
the first time, real-world epidemiological data,
advanced EA-based optimization and database
preprocessing using Cook's distance.
So far, we have predicted CVDs for the next
7-9 years using predictors measured at a single time
point. In the future, we plan to test the presented
approach on predictor data from multiple time points
(baseline, 4-year and 11-year examinations) and to
make predictions for longer periods.
Moreover, as a next step, we should take
advantage of the distinctive ability of the MOEA to
return a set of non-dominated points, which might be
combined into an ensemble of SVM models.
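The idea of building an ensemble from non-dominated points can be illustrated with a minimal sketch. It assumes two minimised criteria per candidate model (here, hypothetically, 1 - F-score and the number of selected variables) and binary class labels; the function names, the candidate values and the majority-vote rule are illustrative assumptions, not the procedure used in this study.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimised)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of the points that no other point dominates."""
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p)
                       for j, q in enumerate(points) if j != i)]


def majority_vote(predictions: List[List[int]]) -> List[int]:
    """Combine 0/1 predictions (one row per model) by majority vote.
    Ties are resolved in favour of class 1 (an arbitrary assumption)."""
    n_models = len(predictions)
    return [1 if 2 * sum(col) >= n_models else 0
            for col in zip(*predictions)]


# Hypothetical candidates: (1 - F-score, number of selected variables).
candidates = [(0.30, 12), (0.25, 20), (0.35, 8), (0.32, 15), (0.25, 25)]
front = non_dominated(candidates)

# Hypothetical predictions of the models on the front for three subjects.
votes = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
```

In this toy example the points (0.32, 15) and (0.25, 25) are dominated and dropped, so the ensemble would consist of the three remaining models. In practice each retained point would correspond to a trained SVM with its own variable subset and parameters.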
REFERENCES
Bellazzi, R., Zupan, B., 2008. Predictive data mining in
clinical medicine: Current issues and guidelines.
International Journal of Medical Informatics, vol. 77,
issue 2, pp. 81-97.
Beretta, L., Santaniello, A., 2016. Nearest neighbour
imputation algorithms: a critical evaluation. BMC Med
Inform Decis Mak, 16 Suppl 3: 74. doi:
10.1186/s12911-016-0318-z.
Boser, B.E., Guyon, I.M., Vapnik, V.N., 1992. A training
algorithm for optimal margin classifiers. Proceedings
of the fifth annual workshop on Computational
learning theory – COLT '92, pp. 144-52.
Brameier, M., Banzhaf, W., 2001. A Comparison of
Linear Genetic Programming and Neural Networks in
Medical Data Mining. IEEE Transactions on
Evolutionary Computation, vol. 5, no. 1, pp. 1-10.
Brester, Ch., Kauhanen, J., Tuomainen, T. P., Semenkin,
E., Kolehmainen, M., 2016. Comparison of Two-
Criterion Evolutionary Filtering Techniques in
Cardiovascular Predictive Modelling. Proceedings of
the 13th International Conference on Informatics in
Control, Automation and Robotics (ICINCO), vol. 1,
pp. 140-145.
Brester, Ch., Ryzhikov, I., Semenkin, E., Kolehmainen,
M., 2018. On Island Model Performance for
Cooperative Real-Valued Multi-Objective Genetic
Algorithms. Advances in Swarm and Computational
Intelligence. In press.
Cheng, T-H., Wei, Ch-P., Tseng, V.S., 2006. Feature
Selection for Medical Data Mining: Comparisons of
Expert Judgment and Automatic Approaches. IEEE
proc of 19th IEEE Symposium on Computer-Based
Medical Systems (CBMS'06), pp. 165-170.
Cho, M. Y., Hoang, T. T., 2017. Feature Selection and
Parameters Optimization of SVM Using Particle
Swarm Optimization for Fault Classification in Power
Distribution Systems. Comput Intell Neurosci.
DOI: 10.1155/2017/4135465.
Cook, R.D., 1977. Detection of influential observation in
linear regression. Technometrics, 19, pp. 15-18.
Ghaddar, B., Naoum-Sawaya, J., 2018. High dimensional
data classification and feature selection using support
vector machines. European Journal of Operational
Research, vol. 265, issue 3, pp. 993-1004.
Goutte, C., Gaussier, E., 2005. A probabilistic
interpretation of precision, recall and F-score, with
implication for evaluation. ECIR'05 Proceedings of
the 27th European conference on Advances in
Information Retrieval Research, pp. 345–359.
Hall, M., Frank, E., Holmes, G., Pfahringer, B.,
Reutemann, P., Witten, I. H., 2009. The WEKA Data
Mining Software: An Update. SIGKDD Explorations,
Volume 11, Issue 1.
Kurl, S., Jae, S. Y., Kauhanen, J., Ronkainen, K.,
Laukkanen, J. A., 2015. Impaired pulmonary function is
a risk predictor for sudden cardiac death in men. Ann
Med, 47(5), pp. 381-385.
Liao, P., Zhang, X., Li, K., 2015. Parameter Optimization
for Support Vector Machine Based on Nested Genetic
Algorithms. Journal of Automation and Control
Engineering, vol. 3, no. 6, pp. 507-511.
Liu, M., Zou, X., Chen, Y., Wu, Z., 2009. Performance
assessment of DMOEA-DD with CEC 2009 MOEA
competition test instances. 2009 IEEE Congress on
Evolutionary Computation.
DOI: 10.1109/CEC.2009.4983309.
Platt, J., 1999. Fast Training of Support Vector Machines
using Sequential Minimal Optimization. Advances in
Kernel Methods, pp. 185-208.
Ren, Y., Bai, G., 2010. Determination of Optimal SVM
Parameters by Using GA/PSO. Journal of computers,
vol. 5, no. 8, pp. 1160-1168.
Salonen, J. T., 1988. Is there a continuing need for
longitudinal epidemiologic research? The Kuopio
Ischaemic Heart Disease Risk Factor Study. Ann Clin
Res, 20(1-2), pp. 46-50.
Syarif, I., Prugel-Bennett, A., Wills, G., 2016. SVM
Parameter Optimization Using Grid Search and
Genetic Algorithm to Improve Classification
Performance. Telkomnika, vol. 14, no. 4, pp. 1502-1509.
Tolmunen, T., Lehto, S. M., Julkunen, J., Hintikka, J.,
Kauhanen, J., 2014. Trait anxiety and somatic
concerns associate with increased mortality risk: a 23-
year follow-up in aging men. Ann Epidemiol, 24(6),
pp. 463-468.
Tu, M. C., Shin, D., Shin, D. K., 2009. A Comparative
Study of Medical Data Classification Methods Based
on Decision Tree and Bagging Algorithms. IEEE proc
of Eighth International Conference on Dependable
Autonomic and Secure Computing, pp. 183-187.
Virtanen, J. K., Mursu, J., Virtanen, H. E., Fogelholm, M.,
Salonen, J. T., Koskinen, T. T., Voutilainen, S.,
Tuomainen, T. P., 2016. Associations of egg and
cholesterol intakes with carotid intima-media
thickness and risk of incident coronary artery disease
according to apolipoprotein E phenotype in men: the
Kuopio Ischemic Heart Disease Risk Factor Study. Am
J Clin Nutr, 103(3), pp. 895-901.
World Health Organization: fact sheet ‘Cardiovascular