
Future work will focus on analyzing the clusters
generated by this model and documenting the resulting
insights, once the Spamley dataset has been expanded
to improve the generalizability and accuracy of the
models. Further effort will go toward identifying
additional variables that can be incorporated into
the model to refine user profiling.
Enhanced Predictive Clustering of User Profiles: A Model for Classifying Individuals Based on Email Interaction and Behavioral Patterns