Evolutionary Symbiotic Feature Selection for Email Spam Detection
Paulo Cortez, Rui Vaz, Miguel Rocha, Miguel Rio, Pedro Sousa
2012
Abstract
This work presents a symbiotic filtering approach that enables different users to exchange relevant word features in order to improve their local anti-spam filters. Local spam filtering is based on a Content-Based Filtering strategy, where word frequencies are fed into a Naive Bayes learner. Several Evolutionary Algorithms are explored for feature selection, including the proposed symbiotic exchange of the most relevant features among different users. The experiments were conducted using a novel corpus based on the well-known Enron datasets mixed with recent spam. The obtained results show that the symbiotic approach is competitive.
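The pipeline described in the abstract (word features, a Naive Bayes learner, evolutionary feature selection, and feature exchange between users) can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the toy corpus, the function names, the use of validation accuracy as the fitness, a simple one-parent hill-climbing search standing in for the Evolutionary Algorithms the authors explored, and modelling the symbiotic exchange as seeding one user's search with the subset another user selected.

```python
import math
import random
from collections import Counter


def train_nb(messages, labels, features):
    """Fit multinomial Naive Bayes word counts, restricted to `features`."""
    counts = {0: Counter(), 1: Counter()}  # 0 = ham, 1 = spam
    docs = Counter(labels)
    for words, y in zip(messages, labels):
        counts[y].update(w for w in words if w in features)
    return counts, docs, features


def predict_nb(model, words):
    """Return 1 (spam) or 0 (ham) using Laplace-smoothed log-probabilities."""
    counts, docs, features = model
    n_docs = sum(docs.values())
    scores = {}
    for y in (0, 1):
        total = sum(counts[y].values())
        score = math.log((docs[y] + 1) / (n_docs + 2))
        for w in words:
            if w in features:
                score += math.log((counts[y][w] + 1) / (total + len(features)))
        scores[y] = score
    return 1 if scores[1] > scores[0] else 0


def accuracy(features, train, valid):
    """Fitness (assumed): validation accuracy of a filter using `features`."""
    model = train_nb(train[0], train[1], features)
    hits = sum(predict_nb(model, m) == y for m, y in zip(*valid))
    return hits / len(valid[1])


def evolve(vocab, train, valid, seeds=(), generations=30, seed=0):
    """One-parent evolutionary search over feature subsets.

    `seeds` are features received from another user (hypothetical exchange).
    """
    rng = random.Random(seed)
    vocab = sorted(vocab)
    best = frozenset(seeds) & frozenset(vocab)
    if not best:
        best = frozenset(rng.sample(vocab, 2))
    best_fit = accuracy(best, train, valid)
    for _ in range(generations):
        child = best ^ {rng.choice(vocab)}  # mutation: toggle one word
        fit = accuracy(child, train, valid)
        if fit >= best_fit:  # keep the child if it is no worse
            best, best_fit = child, fit
    return best, best_fit


# Toy data (invented): each message is a list of words, label 1 = spam.
train = ([["cheap", "pills", "buy"], ["meeting", "tomorrow", "agenda"],
          ["buy", "cheap", "watches"], ["project", "agenda", "review"]],
         [1, 0, 1, 0])
valid = ([["cheap", "buy", "now"], ["agenda", "meeting", "notes"]], [1, 0])
vocab = {w for msg in train[0] for w in msg}

shared, _ = evolve(vocab, train, valid, seed=1)               # user A selects
mine, fit = evolve(vocab, train, valid, seeds=shared, seed=2)  # user B reuses
```

In this sketch user B starts from user A's selected words instead of a random subset, which is one simple way to picture an exchange of relevant features; the paper's actual exchange mechanism and algorithms may differ.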
References
- De Jong, K. (2006). Evolutionary computation: a Unified Approach. The MIT Press.
- Dudley, J., Barone, L., and While, L. (2008). Multiobjective spam filtering using an evolutionary algorithm, pages 123-130. IEEE.
- Evangelista, P., Maia, P., and Rocha, M. (2009). Implementing metaheuristic optimization algorithms with JECoLi. In Intelligent Systems Design and Applications, 2009. ISDA'09. Ninth International Conference on, pages 505-510. IEEE.
- Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27:861-874.
- Flexer, A. (1996). Statistical Evaluation of Neural Networks Experiments: Minimum Requirements and Current Practice. In Proc. of the 13th European Meeting on Cybernetics and Systems Research, volume 2, pages 1005-1008, Vienna, Austria.
- Garriss, S., Kaminsky, M., Freedman, M., Karp, B., Mazières, D., and Yu, H. (2006). RE: reliable email. In Proc. of the 3rd conference on Networked Systems Design and Implementation (NSDI), pages 297-310, San Jose, CA. USENIX Association Berkeley, USA.
- Gray, A. and Haahr, M. (2004). Personalised, Collaborative Spam Filtering. In 1st Conference on E-Mail and AntiSpam CEAS.
- Guyon, I. and Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3:1157-1182.
- Lopes, C., Cortez, P., Sousa, P., Rocha, M., and Rio, M. (2011). Symbiotic filtering for spam email detection. Expert Systems with Applications, 38(8):9365-9372.
- Lopez-Herrera, A., Herrera-Viedma, E., and Herrera, F. (2008). A multiobjective evolutionary algorithm for spam e-mail filtering. In Intelligent System and Knowledge Engineering, 2008. ISKE 2008. 3rd International Conference on, volume 1, pages 366-371.
- Méndez, J., Cid, I., Glez-Peña, D., Rocha, M., and Fdez-Riverola, F. (2008). A Comparative Impact Study of Attribute Selection Techniques on Naive Bayes Spam Filters. In Springer, editor, 8th Industrial Conference on Data Mining, volume LNAI 5077, pages 213-227.
- Metsis, V., Androutsopoulos, I., and Paliouras, G. (2006). Spam filtering with Naive Bayes - which Naive Bayes? In Third Conference on Email and AntiSpam CEAS, pages 125-134. Citeseer.
- Mierswa, I., Wurst, M., Klinkenberg, R., Scholz, M., and Euler, T. (2006). Yale: Rapid prototyping for complex data mining tasks. In Proc. of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 935-940. ACM.
- Radcliffe, N. (1993). Genetic set recombination. Foundations of Genetic Algorithms, 2:203-219.
- Zhang, Y., Li, H., Niranjan, M., and Rockett, P. (2008). Applying cost-sensitive multiobjective genetic programming to feature extraction for spam e-mail filtering. In Proc. of the 11th European conference on Genetic programming, pages 325-336. Springer-Verlag.
Paper Citation
in Harvard Style
Cortez P., Vaz R., Rocha M., Rio M. and Sousa P. (2012). Evolutionary Symbiotic Feature Selection for Email Spam Detection. In Proceedings of the 9th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO, ISBN 978-989-8565-21-1, pages 159-164. DOI: 10.5220/0004010201590164
in Bibtex Style
@conference{icinco12,
author={Paulo Cortez and Rui Vaz and Miguel Rocha and Miguel Rio and Pedro Sousa},
title={Evolutionary Symbiotic Feature Selection for Email Spam Detection},
booktitle={Proceedings of the 9th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO},
year={2012},
pages={159-164},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004010201590164},
isbn={978-989-8565-21-1},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 9th International Conference on Informatics in Control, Automation and Robotics - Volume 1: ICINCO
TI - Evolutionary Symbiotic Feature Selection for Email Spam Detection
SN - 978-989-8565-21-1
AU - Cortez P.
AU - Vaz R.
AU - Rocha M.
AU - Rio M.
AU - Sousa P.
PY - 2012
SP - 159
EP - 164
DO - 10.5220/0004010201590164