affected as well; they are easily lured into clicking
malicious links. This paper proposed a novel PSO-based
ensemble to advance research on the problem of
detecting malicious web links.
Our ensemble is heterogeneous, combining
multiple ML algorithms that proved to be efficient individually.
The combining mechanism uses weights
generated by the PSO algorithm on a validation set.
The experiments follow a calibration stage
and are evaluated on two different datasets. The results
achieved on the first dataset (97.05% accuracy)
improve on the previous solution. In contrast, on the
second dataset, our approach is less accurate than
the solution found in the literature. There is
still room for improvement, but we believe we
have proposed an innovative empirical approach
to malicious web link detection.
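To make the combining mechanism concrete, the following is a minimal sketch of how PSO can search for ensemble weights on a validation set. It is not the implementation used in the experiments: the plain-Python PSO loop, the fitness function (validation accuracy of a weighted average of class probabilities), the hyperparameter values, and the toy probabilities in `VAL_PROBAS` are all assumptions made for illustration.

```python
import random


def weighted_vote(probas, weights):
    """Average per-model malicious-class probabilities using the given weights."""
    total = sum(weights) or 1.0  # guard against an all-zero weight vector
    n_samples = len(probas[0])
    return [
        1 if sum(w * p[i] for w, p in zip(weights, probas)) / total >= 0.5 else 0
        for i in range(n_samples)
    ]


def accuracy(pred, y_true):
    return sum(int(a == b) for a, b in zip(pred, y_true)) / len(y_true)


def pso_weights(probas, y_val, n_particles=20, n_iters=50,
                inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for one weight per model in [0, 1] that maximizes validation accuracy."""
    rng = random.Random(seed)
    dim = len(probas)
    # Particle 0 starts at equal weights, so the result is never worse than
    # plain averaging on the validation set; the rest start at random.
    pos = [[1.0] * dim] + [[rng.random() for _ in range(dim)]
                           for _ in range(n_particles - 1)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [accuracy(weighted_vote(probas, p), y_val) for p in pos]
    best = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep every weight inside [0, 1].
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            fit = accuracy(weighted_vote(probas, pos[i]), y_val)
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Made-up validation data: three hypothetical base classifiers, eight links.
VAL_LABELS = [1, 0, 1, 1, 0, 0, 1, 0]
VAL_PROBAS = [
    [0.9, 0.2, 0.8, 0.7, 0.1, 0.3, 0.6, 0.4],
    [0.6, 0.6, 0.4, 0.7, 0.3, 0.6, 0.3, 0.2],
    [0.4, 0.7, 0.3, 0.6, 0.6, 0.2, 0.4, 0.7],
]

if __name__ == "__main__":
    weights, fit = pso_weights(VAL_PROBAS, VAL_LABELS)
    print("weights:", [round(w, 3) for w in weights], "validation accuracy:", fit)
```

In practice the rows of `VAL_PROBAS` would come from the trained base classifiers evaluated on held-out validation links, and the learned weights would then be applied unchanged to the test set.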
As future work, we propose to develop
a real-time reporting framework that collects
data associated with a link, including time-dependent
features that can improve detection algorithms and reduce
their complexity. Time-dependent features include
network information and whether the web link appears
in blacklists or whitelists. The framework is
intended to capture a snapshot of the current link data.
Moreover, we propose to test this PSO ensemble
on larger datasets to further assess its efficiency
and robustness.
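As an illustration of the snapshot idea in the proposed reporting framework, the sketch below shows one possible shape for a per-link record. Everything in it is hypothetical: the `LinkSnapshot` type, its field names, and the `take_snapshot` helper are assumptions introduced here, not part of an existing framework.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class LinkSnapshot:
    """Hypothetical record of the time-dependent data known about a link."""
    url: str
    captured_at: datetime          # when the snapshot was taken (UTC)
    in_blacklist: bool             # link found in a blacklist at capture time
    in_whitelist: bool             # link found in a whitelist at capture time
    network_info: dict = field(default_factory=dict)  # free-form network data


def take_snapshot(url, blacklist, whitelist, network_info=None, now=None):
    """Build a snapshot from list membership and optional network information."""
    return LinkSnapshot(
        url=url,
        captured_at=now or datetime.now(timezone.utc),
        in_blacklist=url in blacklist,
        in_whitelist=url in whitelist,
        network_info=dict(network_info or {}),
    )


if __name__ == "__main__":
    snap = take_snapshot(
        "http://example.test/login",
        blacklist={"http://example.test/login"},
        whitelist=set(),
    )
    print(snap)
```

Storing the capture time with each record would let a detector compare snapshots of the same link taken at different moments.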
ICAART 2024 - 16th International Conference on Agents and Artificial Intelligence