Towards Automated Comprehensive Feature Engineering for Spam Detection

Fred Kiwanuka, Ja’far Alqatawna, Anang Amin, Sujni Paul, Hossam Faris

Abstract

Every day, billions of emails pass through online servers, of which about 59% are spam according to recent research. Spam emails increasingly contain viruses or other malware and pose a security risk to computer systems. Spam filtering and the security of computer systems are more important than ever. Spam now evolves so quickly that previously successful detection methods are failing to cope. In this paper, we propose a comprehensive and automated feature engineering framework for spam classification. The proposed framework enables, first, the development of a large number of features from any email corpus and, second, the automated extraction of features using feature transformation and aggregation primitives. We show that spam classification performance improves by between 2% and 28% for almost all conventional machine learning classifiers when using automated feature engineering. As a by-product of our comprehensive automated feature engineering, we develop a Python-based open source tool that incorporates the proposed framework.
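The abstract's distinction between transformation and aggregation primitives can be illustrated with a small sketch. This is not the authors' tool or its API: the toy corpus, the primitive names (`num_chars`, `upper_ratio`, and so on), and the choice to group emails by sender are all assumptions made purely for illustration. A transformation primitive here maps one email to one value, while an aggregation primitive summarises those values over a group of emails.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy corpus of (sender, body) pairs, invented for illustration.
EMAILS = [
    ("a@example.com", "WIN a FREE prize now!!!"),
    ("a@example.com", "Claim your 1000 dollars today"),
    ("b@example.com", "Meeting moved to 3pm"),
]

# Transformation primitives (assumed names): one email body -> one number.
TRANSFORMS = {
    "num_chars": len,
    "num_words": lambda text: len(text.split()),
    "num_exclaims": lambda text: text.count("!"),
    "upper_ratio": lambda text: sum(c.isupper() for c in text) / max(len(text), 1),
}

# Aggregation primitives (assumed names): list of numbers -> one number.
AGGREGATES = {"mean": mean, "max": max}


def transform_features(body):
    """Apply every transformation primitive to a single email body."""
    return {name: fn(body) for name, fn in TRANSFORMS.items()}


def aggregate_features(emails):
    """Group emails by sender and aggregate each transformed feature."""
    grouped = defaultdict(list)
    for sender, body in emails:
        grouped[sender].append(transform_features(body))
    result = {}
    for sender, rows in grouped.items():
        result[sender] = {
            f"{agg_name}_{feat}": agg([row[feat] for row in rows])
            for feat in TRANSFORMS
            for agg_name, agg in AGGREGATES.items()
        }
    return result


if __name__ == "__main__":
    for sender, feats in aggregate_features(EMAILS).items():
        print(sender, feats)
```

In this sketch, four transformation primitives combined with two aggregation primitives yield eight features per sender, which hints at how stacking primitives can produce a large feature set from a small number of building blocks.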


Paper Citation


in Harvard Style

Kiwanuka F., Alqatawna J., Amin A., Paul S. and Faris H. (2019). Towards Automated Comprehensive Feature Engineering for Spam Detection. In Proceedings of the 5th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP, ISBN 978-989-758-359-9, pages 429-437. DOI: 10.5220/0007393004290437


in Bibtex Style

@conference{icissp19,
author={Fred Kiwanuka and Ja’far Alqatawna and Anang Amin and Sujni Paul and Hossam Faris},
title={Towards Automated Comprehensive Feature Engineering for Spam Detection},
booktitle={Proceedings of the 5th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP},
year={2019},
pages={429--437},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007393004290437},
isbn={978-989-758-359-9},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 5th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP
TI - Towards Automated Comprehensive Feature Engineering for Spam Detection
SN - 978-989-758-359-9
AU - Kiwanuka F.
AU - Alqatawna J.
AU - Amin A.
AU - Paul S.
AU - Faris H.
PY - 2019
SP - 429
EP - 437
DO - 10.5220/0007393004290437