6 CONCLUSIONS AND FUTURE WORK
In this paper, a framework of heuristic ensemble of filters (HEF) has been proposed to overcome the weaknesses of single filters. It combines the outputs of two types of filters, subset filters (SF) and rank filters (RF), using heuristic rules as consensus functions to improve the consistency and effectiveness of feature selection.
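As a purely illustrative aid, the sketch below shows one way outputs from rank and subset filters can be merged through a simple consensus rule. It is a minimal sketch, not the paper's actual algorithm: the two rank filters (mutual information and ANOVA F-score), the toy correlation-based subset filter, and the OR/AND consensus rule are all assumptions standing in for the HEF heuristics (including the R1 variant), which are defined earlier in the paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import f_classif, mutual_info_classif

def top_k(scores, k):
    """Indices of the k highest-scoring features, as a set."""
    return set(np.argsort(scores)[::-1][:k])

def toy_subset_filter(X, y, threshold=0.1):
    """Stand-in subset filter: keep features whose absolute Pearson
    correlation with the class label exceeds a threshold (not FCBF/CFS)."""
    corr = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    return set(np.flatnonzero(np.array(corr) > threshold))

def ensemble_select(X, y, k=10):
    """Hypothetical consensus rule: accept a feature if the subset filter
    selects it, or if both rank filters place it in their top k."""
    mi = mutual_info_classif(X, y, random_state=0)  # rank filter 1
    f_scores, _ = f_classif(X, y)                   # rank filter 2
    rank_agreement = top_k(mi, k) & top_k(f_scores, k)
    return sorted(toy_subset_filter(X, y) | rank_agreement)

if __name__ == "__main__":
    X, y = make_classification(n_samples=200, n_features=50,
                               n_informative=8, random_state=0)
    print("selected features:", ensemble_select(X, y))
```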
The proposed HEF and HEF-R1 have been tested on 11 benchmark datasets, with the number of features ranging from 17 to 15,154. The statistical analysis of the experimental results shows that the ensemble technique performed more consistently, and in some cases more accurately, than the individual filters. Specifically,
1. HEF-R1 performed best for NBC and KNN, while HEF performed best when using the SVM classifier, which demonstrates that our proposed ensemble is more reliable and consistent than using single filters.
2. There is no single best approach for all situations. In other words, the performance of the single filters varies from dataset to dataset and is also influenced by the type of model chosen as the classifier. Thus, one filter may perform well on a given dataset for a particular classifier but poorly when used on a different dataset or with a different type of classifier.
3. Among the four filters used in our heuristic ensemble, the subset filters (FCBF and CFS) were, on average, more frequently better and less frequently worse than the rank filters.
4. The experimental results show that the ensemble technique performed better overall than any individual filter in terms of reliability, consistency and accuracy.
Future work may include additional experiments measuring the stability of our approach, which would provide a further way to evaluate our results. In addition, investigations could be conducted with different numbers and types of filters. Finally, we plan to use ensemble classification to overcome the performance differences among the individual classifiers.
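As one possible way to quantify such selection stability (an assumption about the evaluation, not the paper's protocol), a common choice is the average pairwise Jaccard similarity between the feature subsets selected on bootstrap resamples of the data; the `select_fn` argument below could be any selector, such as the illustrative `ensemble_select` sketched earlier.

```python
from itertools import combinations

import numpy as np

def jaccard(a, b):
    """Jaccard similarity of two feature-index sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def selection_stability(select_fn, X, y, n_runs=10, seed=0):
    """Average pairwise Jaccard similarity of the feature subsets
    returned by select_fn over bootstrap resamples of (X, y)."""
    rng = np.random.default_rng(seed)
    subsets = []
    for _ in range(n_runs):
        idx = rng.integers(0, len(y), size=len(y))  # bootstrap sample
        subsets.append(select_fn(X[idx], y[idx]))
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A score near 1 indicates the selector returns nearly the same features regardless of sampling noise, which complements the accuracy-based comparisons reported above.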