Phishing Email Detection based on Named Entity Recognition

Vít Listík, Šimon Let, Jan Šedivý, Václav Hlaváč

Abstract

This work evaluates two phishing detection algorithms, both based on named entity recognition (NER), on live traffic of Email.cz. The first algorithm, proposed in (Ramanathan and Wechsler, 2013), uses NER and latent Dirichlet allocation (LDA) as feature extractors for a random forest classifier. It achieved a 100% F-measure on the publicly available testing dataset, and we use it as the baseline for our newly proposed solution. The proposed solution takes the companies detected by NER and compares the URLs present in the email content to the company URL profile, which is built from history. The company URL profile contains domains that are frequently mentioned in legitimate traffic from that domain. The advantage of the proposed solution is that it does not need a phishing dataset, which is hard to obtain, especially for languages other than English. Our solution outperforms the baseline solution. Both solutions are able to detect previously undetected phishing attacks. A combination of the two solutions achieves a 100% F-measure on a portion of live traffic.
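The URL-profile idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `min_share` threshold, and the rule for what counts as "frequently mentioned" are all assumptions made for illustration, and the company names are taken as given, standing in for the output of an NER step that is not shown.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def url_domain(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def build_profiles(legit_emails, min_share=0.05):
    """Build a mapping company -> domains frequently seen in legitimate mail.

    legit_emails: iterable of (companies, urls) pairs; `companies` stands in
    for the entities an upstream NER step would detect (assumption).
    min_share: hypothetical cutoff, the minimum fraction of a company's
    emails in which a domain must appear to enter its profile.
    """
    counts = defaultdict(Counter)
    totals = Counter()
    for companies, urls in legit_emails:
        domains = {url_domain(u) for u in urls} - {""}
        for company in companies:
            totals[company] += 1
            counts[company].update(domains)
    return {
        company: {d for d, n in counter.items() if n / totals[company] >= min_share}
        for company, counter in counts.items()
    }


def is_suspicious(companies, urls, profiles):
    """Flag an email whose URLs fall outside the profiles of mentioned companies."""
    known = [profiles[c] for c in companies if c in profiles]
    if not known:
        return False  # no profile for any mentioned company, so no decision
    allowed = set().union(*known)
    return any(url_domain(u) not in allowed for u in urls if url_domain(u))
```

Because the profiles are built only from legitimate history, a sketch like this needs no labeled phishing examples, which matches the advantage the abstract claims for the proposed solution.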


Paper Citation


in Harvard Style

Listík V., Let Š., Šedivý J. and Hlaváč V. (2019). Phishing Email Detection based on Named Entity Recognition. In Proceedings of the 5th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP, ISBN 978-989-758-359-9, pages 252-256. DOI: 10.5220/0007314202520256


in Bibtex Style

@conference{icissp19,
author={Vít Listík and Šimon Let and Jan Šedivý and Václav Hlaváč},
title={Phishing Email Detection based on Named Entity Recognition},
booktitle={Proceedings of the 5th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP},
year={2019},
pages={252-256},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0007314202520256},
isbn={978-989-758-359-9},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 5th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP
TI - Phishing Email Detection based on Named Entity Recognition
SN - 978-989-758-359-9
AU - Listík V.
AU - Let Š.
AU - Šedivý J.
AU - Hlaváč V.
PY - 2019
SP - 252
EP - 256
DO - 10.5220/0007314202520256