ter Stream and Turkish newspaper data. We work on
real-time data so that security analysts can use our
research. Existing publications on real-time
cybersecurity event detection systems generally use
English texts to analyze and detect events. We could
not find any research that uses Turkish data sources
to detect cybersecurity events, so using Turkish data
sources for cybersecurity event detection is a new
topic in the literature. We believe that this research
contributes to the literature by addressing an
uninvestigated field. We proposed an automated software
system that uses different data sources, named
entities, text mining methods, and state-of-the-art
software techniques, and we then analyzed its results.
Although our software system detects a few
false-positive cybersecurity events, it was often
able to detect useful cybersecurity events. For
example, our software system can detect cybersecurity
events such as the WhatsApp Spyware, the MuddyWater
Attack, the Remote Patient Tracking System
Applications vulnerability, the Pirate Matryoshka
Virus, and the Zombie Cookies threat. We conclude that
event detection using Turkish texts is feasible, and
that security analysts can use a system such as ours
as a helper tool.
5.2 Limitations and Future Work
Currently, our software system runs on a local
computer. Once we move the software to a server (e.g.,
AWS), it can run 24/7, which should improve detection
success. If our software can process more data, it
should detect more events and detect them more
accurately. To increase the volume of streaming data,
we plan to add new Turkish data sources from other
websites such as Eksisozluk, LinkedIn, and Facebook.
This improvement should make our datasets a valuable
resource for future work. After these improvements,
our datasets can be useful not only to us but also to
other researchers working in the cybersecurity,
cognitive science, or computer science fields. We
shared our software solution as an open-source project
on GitHub under the Apache-2.0 license; it is
available at https://github.com/ozzgural/MSThesis. We
also plan to share our future work there and to refine
our software tool according to user feedback. The
developed scenario may be applied to other languages
with the necessary modifications, which is also part
of our future plans. Moreover, we do not yet handle
named entity recognition ambiguities; we plan to
handle them in the future.
Automatic Detection of Cyber Security Events from Turkish Twitter Stream and Newspaper Data