A Study of Classification of Texts into Categories of Cybersecurity Incident and Attack with Topic Models
Masahiro Ishii, Satoshi Matsuura, Kento Mori, Masahiko Tomoishi, Yong Jin, Yoshiaki Kitaguchi
2020
Abstract
To improve and automate cybersecurity incident handling in security operations centers (SOCs) and computer emergency response teams (CERTs), security intelligence extracted from various internal and external sources, including incident response playbooks, incident reports from individual SOCs and CERTs, the National Vulnerability Database, and social media, must be utilized. In this paper, we apply various topic models to classify text related to cybersecurity intelligence and incidents according to topics derived from incidents and cyber attacks. We analyze cybersecurity incident reports and related text from our CERT, together with security blog posts, using naive latent Dirichlet allocation (LDA), seeded LDA, and labeled LDA topic models. Labeling text based on designated categories is difficult and time-consuming. Training the seeded model does not require text to be labeled; instead, seed words are given to allow the model to infer topic-word and document-topic distributions for the text. We show that a seeded topic model can be used to extract and classify intelligence in our CERT, and that its inference on text is more precise than that of a supervised topic model.
Paper Citation
in Harvard Style
Ishii M., Matsuura S., Mori K., Tomoishi M., Jin Y. and Kitaguchi Y. (2020). A Study of Classification of Texts into Categories of Cybersecurity Incident and Attack with Topic Models. In Proceedings of the 6th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP, ISBN 978-989-758-399-5, pages 639-646. DOI: 10.5220/0009099606390646
in Bibtex Style
@conference{icissp20,
author={Masahiro Ishii and Satoshi Matsuura and Kento Mori and Masahiko Tomoishi and Yong Jin and Yoshiaki Kitaguchi},
title={A Study of Classification of Texts into Categories of Cybersecurity Incident and Attack with Topic Models},
booktitle={Proceedings of the 6th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP},
year={2020},
pages={639--646},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0009099606390646},
isbn={978-989-758-399-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 6th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP
TI - A Study of Classification of Texts into Categories of Cybersecurity Incident and Attack with Topic Models
SN - 978-989-758-399-5
AU - Ishii M.
AU - Matsuura S.
AU - Mori K.
AU - Tomoishi M.
AU - Jin Y.
AU - Kitaguchi Y.
PY - 2020
SP - 639
EP - 646
DO - 10.5220/0009099606390646