Authors:
Takuro Hada (1, 2); Yuichi Sei (1, 3); Yasuyuki Tahara (1) and Akihiko Ohsuga (2)
Affiliations:
(1) The University of Electro-Communications, Tokyo, Japan
(2) First Organized Crime Countermeasures Division, Organized Crime Department, Criminal Investigation Bureau, National Police Agency, Japan
(3) JST PRESTO, Saitama, Japan
Keyword(s):
Dark Jargon, Compound Word, Microblog, Twitter, Word Embedding, Word2Vec.
Abstract:
Recently, drug trafficking on microblogs has increased and become a social problem. While cyber patrols are conducted to combat such crimes, those who post messages that lead to crimes evade monitoring by using so-called “dark jargon,” terms that conceal their criminal intentions, instead of the keywords (“drug,” “marijuana,” etc.) that are the target of monitoring. Even if the monitors learn these dark jargon terms, the terms become obsolete over time as they become more common, and new ones emerge. We previously proposed a method for detecting dark jargon with criminal intent based on differences in the usage of words in posts and achieved a certain level of success. In this study, we propose a method that uses similar words to detect compound-type dark jargon, which combines two or more words and has been difficult to detect using existing methods. To confirm the effectiveness of the proposed method, we conducted a detection experiment with compound words and a detection experiment with dark jargon. The results confirmed that the proposed method can detect compound-type dark jargon that existing methods could not detect.
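As a rough illustration of the general idea only (not the authors' method, whose details the abstract does not give), the sketch below joins frequently adjacent word pairs into single compound tokens and then looks for the words whose contexts most resemble a compound's contexts. It assumes simple co-occurrence counts with cosine similarity in place of Word2Vec embeddings, and every name in it (merge_compounds, context_vectors, most_similar) as well as the toy posts and the placeholder compound "blue sky" are invented for this example.

```python
from collections import Counter
from math import sqrt


def merge_compounds(posts, min_count=2):
    """Join adjacent word pairs seen at least min_count times into one token."""
    pair_counts = Counter()
    for tokens in posts:
        pair_counts.update(zip(tokens, tokens[1:]))
    frequent = {pair for pair, n in pair_counts.items() if n >= min_count}

    merged = []
    for tokens in posts:
        out, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in frequent:
                out.append(tokens[i] + "_" + tokens[i + 1])
                i += 2
            else:
                out.append(tokens[i])
                i += 1
        merged.append(out)
    return merged


def context_vectors(posts, window=2):
    """Count, for each token, the tokens within `window` positions of it."""
    vectors = {}
    for tokens in posts:
        for i, word in enumerate(tokens):
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(word, vectors, topn=3):
    """Rank all other tokens by context similarity to `word`."""
    scores = [(other, cosine(vectors[word], vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scores, key=lambda item: -item[1])[:topn]


# Invented toy posts; "blue sky" is a made-up placeholder, not real jargon.
posts = [
    ["selling", "blue", "sky", "dm", "me"],
    ["selling", "marijuana", "dm", "me"],
    ["blue", "sky", "in", "stock", "today"],
    ["marijuana", "in", "stock", "today"],
    ["the", "sky", "is", "clear"],
]
merged = merge_compounds(posts)
vectors = context_vectors(merged)
print(most_similar("blue_sky", vectors, topn=1)[0][0])
```

In this toy data the compound token shares all of its contexts with a monitored keyword, which is the kind of signal a similar-word search could surface. A realistic setup would presumably train embeddings such as Word2Vec on a large corpus of posts rather than rely on raw counts.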