Semantic Enrichment of Relevant Feature Selection Methods for Data Mining in Oncology

Adriana Da Silva Jacinto, Ricardo Da Silva Santos, José Maria Parente De Oliveira


This project presents a proposal for computationally capturing the semantic importance of each feature. The proposal enriches traditional feature selection methods by using Natural Language Processing, the NCI ontology, WordNet, and medical documents. A prototype of this approach was implemented and tested on five data sets related to cancer patients. The results show that using semantics improves preprocessing by selecting the most semantically relevant features.


