then taken by a GNN as new input features and, after message passing,
outperforms both traditional and machine learning techniques. The best
accuracy was achieved using a GCN combined with a Random Forest
classifier. Furthermore, our approach can easily be combined with any
GNN architecture or other downstream classification method. Hence, it
can be adjusted and improved in future work.
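As a rough illustration of this pluggability, the sketch below runs one round of GCN-style propagation (symmetric normalization with self-loops, no learned weights) over per-node features and hands the root node's embedding to an interchangeable downstream classifier. It is a minimal sketch under stated assumptions, not the implementation evaluated here: the function names, the one-dimensional node scores, and the threshold classifier standing in for a Random Forest are all hypothetical.

```python
import math


def gcn_propagate(features, edges):
    """One round of GCN-style propagation: D^-1/2 (A + I) D^-1/2 X.

    Assumption: no learned weight matrix and no non-linearity, so this
    only shows the neighbourhood aggregation step.
    """
    n = len(features)
    # Undirected neighbour sets, each including a self-loop.
    neighbours = [{i} for i in range(n)]
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    degree = [len(nb) for nb in neighbours]
    dim = len(features[0])
    out = []
    for i in range(n):
        row = [0.0] * dim
        for j in neighbours[i]:
            weight = 1.0 / math.sqrt(degree[i] * degree[j])
            for k in range(dim):
                row[k] += weight * features[j][k]
        out.append(row)
    return out


def threshold_classifier(embedding, threshold=0.5):
    """Hypothetical stand-in for a downstream classifier such as a Random Forest."""
    return int(sum(embedding) / len(embedding) > threshold)


def classify_root(features, edges, classifier, root=0):
    """Propagate features over the graph, then classify the root node's embedding."""
    embeddings = gcn_propagate(features, edges)
    return classifier(embeddings[root])


# Hypothetical graph: node 0 is the page under test, nodes 1 and 2 are linked pages.
edges = [(0, 1), (0, 2)]
suspicious = [[1.0], [1.0], [1.0]]
label = classify_root(suspicious, edges, threshold_classifier)
```

Because `classify_root` only requires a callable that maps an embedding to a label, the stand-in classifier can be replaced by any trained model without touching the propagation step.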
In future work, we will focus on building a larger dataset that
contains more diverse examples. This dataset will be used in further
research to improve benchmarking capabilities for GNN-based phishing
classification. We will also focus on improving the accuracy of our
approach by leveraging edge features in the graph.
PhishGNN: A Phishing Website Detection Framework using Graph Neural Networks