Phishing Email Detection based on Named Entity Recognition
Vít Listík, Šimon Let, Jan Šedivý, Václav Hlaváč
2019
Abstract
This work evaluates two phishing detection algorithms, both based on named entity recognition (NER), on live traffic of Email.cz. The first algorithm, proposed in (Ramanathan and Wechsler, 2013), uses NER and latent Dirichlet allocation (LDA) as feature extractors for a random forest classifier. It achieved a 100% F-measure on the publicly available testing dataset, and we use it as the baseline for our newly proposed solution. The proposed solution takes the companies detected by NER and compares the URLs present in the email content to the company URL profile, which is built from history. The company URL profile contains domains that are frequently mentioned in legitimate traffic from that domain. The advantage of the proposed solution is that it does not need a phishing dataset, which is hard to obtain, especially for languages other than English. Our solution outperforms the baseline solution. Both solutions are able to detect previously undetected phishing attacks. A combination of the two solutions achieves a 100% F-measure on a portion of the live traffic.
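The URL-profile idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `min_share` threshold, and the input shape (each email already reduced to a list of NER-detected company names plus a list of URLs) are assumptions made for illustration, and the NER step itself is left out.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def domain_of(url):
    """Return the lowercased host name of a URL, or "" if it has none."""
    return (urlparse(url).hostname or "").lower()


def build_profiles(legit_emails, min_share=0.1):
    """Build a company -> set-of-domains profile from legitimate history.

    `legit_emails` is an iterable of (companies, urls) pairs. A domain enters
    a company's profile when it appears in at least `min_share` of the
    legitimate emails that mention the company (hypothetical threshold).
    """
    counts = defaultdict(Counter)
    totals = Counter()
    for companies, urls in legit_emails:
        domains = {domain_of(u) for u in urls} - {""}
        for company in set(companies):
            totals[company] += 1
            counts[company].update(domains)
    return {
        company: {d for d, n in counter.items() if n / totals[company] >= min_share}
        for company, counter in counts.items()
    }


def suspicious_domains(companies, urls, profiles):
    """Return URL domains that fall outside the profiles of the mentioned companies.

    Emails that mention no profiled company yield an empty set, because the
    history gives no evidence either way.
    """
    known = [c for c in companies if c in profiles]
    if not known:
        return set()
    allowed = set().union(*(profiles[c] for c in known))
    domains = {domain_of(u) for u in urls} - {""}
    return domains - allowed
```

An email would then be flagged when `suspicious_domains` returns a non-empty set. Because the profiles are built only from legitimate traffic, the sketch needs no phishing samples, which matches the advantage stated in the abstract.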
Paper Citation
in Harvard Style
Listík V., Let Š., Šedivý J. and Hlaváč V. (2019). Phishing Email Detection based on Named Entity Recognition. In Proceedings of the 5th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP, ISBN 978-989-758-359-9, pages 252-256. DOI: 10.5220/0007314202520256
in Bibtex Style
@conference{icissp19,
author={Vít Listík and Šimon Let and Jan Šedivý and Václav Hlaváč},
title={Phishing Email Detection based on Named Entity Recognition},
booktitle={Proceedings of the 5th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP},
year={2019},
pages={252-256},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007314202520256},
isbn={978-989-758-359-9},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 5th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP
TI - Phishing Email Detection based on Named Entity Recognition
SN - 978-989-758-359-9
AU - Listík V.
AU - Let Š.
AU - Šedivý J.
AU - Hlaváč V.
PY - 2019
SP - 252
EP - 256
DO - 10.5220/0007314202520256