Authors:
Yu-N Cheah; Sakthiaseelan Karthigasoo and Selvakumar Manickam
Affiliation:
School of Computer Sciences, Universiti Sains Malaysia, Malaysia
Keyword(s):
Knowledge discovery, Clustering ensemble, Neural network ensemble, Discretization, Rough set analysis
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Artificial Intelligence and Decision Support Systems; Biomedical Engineering; Business Analytics; Data Engineering; Data Mining; Databases and Information Systems Integration; Datamining; Enterprise Information Systems; Group Decision Support Systems; Health Information Systems; Sensor Networks; Signal Processing; Soft Computing; Verification and Validation of Knowledge-Based Systems
Abstract:
Knowledge discovery is a useful technique for transforming enterprise data into actionable knowledge. However, its effectiveness is limited because it is difficult to develop a knowledge discovery pipeline that suits all types of datasets. Moreover, it is difficult to select the best possible algorithm for each stage of the pipeline. In this paper, we define (a) a novel clustering ensemble algorithm based on self-organizing maps to automate the annotation of un-annotated medical datasets; (b) a data discretization algorithm based on Boolean reasoning to discretize continuous data values; (c) a rule filtering mechanism; and (d) an extension of the regular knowledge discovery process that includes a learning mechanism based on neural network ensembles to produce a neural knowledge base for decision support. We believe that this would result in a decision support system that is tolerant of ambiguous queries, e.g. queries with incomplete inputs. We also believe that the boosting and aggregating features of ensemble techniques would help to compensate for any shortcomings in some stages of the pipeline. Ultimately, we combine these efforts to produce an extended knowledge discovery pipeline.
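
The aggregating idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' neural network ensemble: it swaps in a simple 1-nearest-neighbour base learner, and the names `distance`, `nearest_label`, `bagged_predict` and the toy `data` are all hypothetical. It only shows how bootstrap resampling plus majority voting can still return an answer when a query has a missing input (marked `None`).

```python
import random
from collections import Counter

def distance(features, query):
    """Squared distance over the features that are present (not None) in the query."""
    return sum((x - y) ** 2 for x, y in zip(features, query) if y is not None)

def nearest_label(sample, query):
    """Base learner (assumed 1-nearest-neighbour): label of the closest training row."""
    _, label = min(sample, key=lambda row: distance(row[0], query))
    return label

def bagged_predict(data, query, n_members=15, seed=0):
    """Fit each member on a bootstrap resample and aggregate by majority vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_members):
        sample = [rng.choice(data) for _ in data]  # resample with replacement
        votes[nearest_label(sample, query)] += 1
    return votes.most_common(1)[0][0]

# Toy, made-up dataset: (features, label)
data = [((1.0, 1.0), "low"), ((1.2, 0.8), "low"), ((0.9, 1.1), "low"),
        ((5.0, 5.0), "high"), ((5.2, 4.8), "high"), ((4.9, 5.1), "high")]

# Query with an incomplete input: the second feature is unknown.
print(bagged_predict(data, (1.1, None)))
```

An individual member can be misled when its resample happens to omit a whole class; voting across members is what makes the combined answer more robust, which is the compensating effect the abstract attributes to ensemble techniques.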