Authors:
Juan L. Domínguez-Olmedo (1); Jacinto Mata (1); Victoria Pachón (1) and Jose L. Lopez-Guerra (2)
Affiliations:
(1) University of Huelva, Spain; (2) University Hospital Virgen del Rocío, Spain
Keyword(s):
Imbalanced Data Classification, Rules Discovery, Prostate Cancer.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Biomedical Engineering; Business Analytics; Cardiovascular Technologies; Computing and Telecommunications in Cardiology; Data Engineering; Data Mining; Databases and Information Systems Integration; Decision Support Systems; Decision Support Systems, Remote Data Analysis; Enterprise Information Systems; Health Engineering and Technology Applications; Health Information Systems; Knowledge-Based Systems; Pattern Recognition and Machine Learning; Sensor Networks; Signal Processing; Soft Computing; Symbolic Systems
Abstract:
This paper describes a rule-based classifier (DEQAR-C), which is built by combining rules selected in a two-phase process. In the first phase, rules are generated and sorted for each class; in the second, a selection is performed to obtain the final list of rules. A real imbalanced dataset on toxicity during and after radiation therapy for prostate cancer was used to compare DEQAR-C with other predictive methods (rule-based, artificial neural networks, trees, Bayesian and logistic regression). DEQAR-C produced excellent results in a cross-validated evaluation covering several performance measures (accuracy, Matthews correlation coefficient, sensitivity, specificity, precision, recall and F-measure). It was therefore used to obtain a predictive model from the full dataset. The resulting model is easily interpretable: it combines three rules over two variables and suggests conditions that are mostly confirmed by the medical literature.
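The performance measures listed in the abstract can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function name `binary_metrics`, the helper `safe_div`, and the example counts are hypothetical and chosen only for illustration. Note that sensitivity and recall are the same quantity.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true/false positives/negatives. Undefined ratios
    (zero denominator) are reported as 0.0, which is a convention
    assumed here, not something stated in the paper.
    """
    def safe_div(num, den):
        return num / den if den else 0.0

    sensitivity = safe_div(tp, tp + fn)          # also called recall
    specificity = safe_div(tn, tn + fp)
    precision = safe_div(tp, tp + fp)
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    f_measure = safe_div(2 * precision * sensitivity, precision + sensitivity)
    # Matthews correlation coefficient: uses all four cells of the matrix.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, mcc_den)
    return {
        "accuracy": accuracy,
        "mcc": mcc,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": sensitivity,
        "f_measure": f_measure,
    }


# Hypothetical imbalanced case: 10 positives, 90 negatives, and a
# classifier that always predicts the majority (negative) class.
majority = binary_metrics(tp=0, fp=0, tn=90, fn=10)
print(majority["accuracy"], majority["sensitivity"], majority["mcc"])
```

In this made-up example the accuracy looks high while sensitivity and the Matthews correlation coefficient are both zero, which illustrates why an evaluation on imbalanced data typically reports several measures rather than accuracy alone.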