Authors:
Christina Brester 1; Ivan Ryzhikov 1; Tomi-Pekka Tuomainen 2; Ari Voutilainen 2; Eugene Semenkin 3 and Mikko Kolehmainen 4
Affiliations:
1 Department of Environmental and Biological Sciences, University of Eastern Finland, Kuopio, Finland; Institute of Computer Sciences and Telecommunication, Reshetnev Siberian State University of Science and Technology, Krasnoyarsk, Russia
2 Institute of Public Health and Clinical Nutrition, University of Eastern Finland, Kuopio, Finland
3 Institute of Computer Sciences and Telecommunication, Reshetnev Siberian State University of Science and Technology, Krasnoyarsk, Russia
4 Department of Environmental and Biological Sciences, University of Eastern Finland, Kuopio, Finland
Keyword(s):
Support Vector Machine, Cardiovascular Predictive Modeling, Multi-objective Evolutionary Algorithm, Parameter Optimization, Variable Selection.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Business Analytics; Cardiovascular Technologies; Computational Intelligence; Computing and Telecommunications in Cardiology; Data Engineering; Decision Support Systems; Decision Support Systems, Remote Data Analysis; Evolutionary Computation and Control; Evolutionary Computing; Genetic Algorithms; Health Engineering and Technology Applications; Informatics in Control, Automation and Robotics; Intelligent Control Systems and Optimization; Knowledge-Based Systems; Optimization Algorithms; Soft Computing; Symbolic Systems
Abstract:
We present a heuristic-based approach for Support Vector Machine (SVM) parameter optimization and variable selection using a real-valued cooperative Multi-Objective Evolutionary Algorithm (MOEA). Because an MOEA can optimize several criteria simultaneously, we aim to maximize SVM performance while minimizing the number of input variables. The second criterion is especially important when obtaining new observations for the training data is expensive: in epidemiology, additional model inputs mean more clinical tests and higher costs. Moreover, variable selection is expected to improve the performance of the model. Therefore, to train an accurate model for predicting cardiovascular diseases, we take an SVM model and optimize its meta-parameters and kernel function parameters on a true population cohort variable set. The proposed approach was tested on the Kuopio Ischemic Heart Disease database, which is one of the most extensively characterized epidemiological databases. In our experiment, we predicted incidents of cardiovascular diseases with a prediction horizon of 7–9 years and found that the MOEA improved model performance from 66.8% to 70.5% and reduced the number of inputs from 81 to about 58, compared to the SVM model with default parameter values on the full set of variables.
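The two criteria named in the abstract (maximize SVM performance, minimize the number of input variables) can be illustrated with a minimal Pareto-dominance sketch. This is not the authors' cooperative MOEA. The function names `dominates`, `pareto_front` and `decode`, the chromosome layout (two log10-scaled SVM parameters labeled `C` and `gamma`, followed by one selector gene per variable) and the 0.5 selection threshold are all assumptions made for illustration only.

```python
from typing import Dict, List, Sequence, Tuple

# A candidate is scored by (performance to maximize, number of inputs to minimize).
Score = Tuple[float, int]


def dominates(a: Score, b: Score) -> bool:
    """Return True if `a` is no worse than `b` on both criteria and better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(scores: Sequence[Score]) -> List[Score]:
    """Keep only the scores that no other score dominates."""
    return [s for s in scores if not any(dominates(o, s) for o in scores)]


def decode(
    genes: Sequence[float], threshold: float = 0.5
) -> Tuple[Dict[str, float], List[bool]]:
    """Hypothetical encoding: two log10-scaled SVM parameters, then one
    selector gene per candidate input variable."""
    log_c, log_gamma, *selectors = genes
    params = {"C": 10.0 ** log_c, "gamma": 10.0 ** log_gamma}
    mask = [g > threshold for g in selectors]  # True means the variable is kept
    return params, mask


# The first two scores are the figures reported in the abstract (the reduced
# input count is approximate); the last two are made-up toy values.
scores = [(0.668, 81), (0.705, 58), (0.690, 40), (0.660, 70)]
front = pareto_front(scores)

# Toy chromosome with three selector genes, chosen only to show the decoding.
params, mask = decode([0.0, -2.0, 0.9, 0.1, 0.7])
```

Under this ordering, the reported optimized model (70.5%, about 58 inputs) dominates the default model (66.8%, 81 inputs), since it is better on both criteria. A multi-objective search would typically return a set of such non-dominated trade-offs rather than a single solution.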