information will improve accuracy by capturing more
about tweets posted shortly after the source tweet.
5.2 Dataset Discussion
In this paper, we created a Japanese rumor dataset
following previous studies (Liu, X., et al. 2015, Ma, J.,
et al. 2016). However, several points about this dataset
need consideration. Unlike Snopes.com, the FIJ data
contains few tweets that belong to the True Group,
which may make multinomial classification more
difficult. To address this, we considered increasing the
amount of data by creating pseudo-extended data, in
which replies are used as source tweets, focusing on
whether each reply affirms or denies the posting, as
shown in Figure 4. However, this approach has open
issues, such as how to handle Unclear tweets and the
fact that denial/affirmation does not always
correspond to True/False.
Figure 4: An example of data extension.
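The data extension idea can be illustrated with a minimal sketch. It is one possible reading of the approach, not the procedure used in this paper: the label names, the stance values "affirm" and "deny", and the helpers `pseudo_label` and `extend` are all hypothetical. The sketch assumes that an affirming reply inherits the source label and a denying reply receives the opposite label, which, as noted above, does not always hold. Replies to Unclear sources and replies with any other stance are skipped.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical label names; the naive assumption is that a denial flips the label.
FLIP = {"True": "False", "False": "True"}


@dataclass
class Reply:
    text: str
    stance: str  # "affirm", "deny", or anything else (treated as unknown)


def pseudo_label(source_label: str, stance: str) -> Optional[str]:
    """Label for a reply promoted to a pseudo source tweet, or None to skip it."""
    if source_label not in FLIP:
        return None  # e.g. Unclear sources: no reliable label to propagate
    if stance == "affirm":
        return source_label
    if stance == "deny":
        return FLIP[source_label]
    return None  # unknown stance: skip rather than guess


def extend(source_label: str, replies: List[Reply]) -> List[Tuple[str, str]]:
    """Build (text, label) pairs for the replies that can be labeled."""
    extended = []
    for reply in replies:
        label = pseudo_label(source_label, reply.stance)
        if label is not None:
            extended.append((reply.text, label))
    return extended
```

Skipping uncertain cases keeps the extended data smaller but avoids injecting labels that the stance alone cannot justify.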
In addition, the evaluation score on the Little dataset
needs to improve if rumors are to be detected early,
close to the time the source tweet is posted. One
solution is to increase the number of feature types.
For example, we believe the emotional information of
a tweet, or the textual information in a summary
sentence, could be extracted as separate features.
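As a rough illustration of adding a feature type, the sketch below appends simple emotion features to an existing text feature vector. Everything in it is an assumption made for illustration: the toy lexicon `EMOTION_LEXICON`, the three emotion categories, and the helpers `emotion_features` and `combine` do not come from this paper. English tokens and whitespace splitting are used only for readability; Japanese tweets would need a morphological analyzer and a Japanese emotion lexicon.

```python
from typing import Dict, List

# Toy emotion lexicon: purely illustrative, not taken from the paper.
EMOTION_LEXICON: Dict[str, str] = {
    "scary": "fear", "afraid": "fear",
    "angry": "anger", "outrageous": "anger",
    "glad": "joy", "happy": "joy",
}
EMOTIONS: List[str] = ["fear", "anger", "joy"]


def emotion_features(text: str) -> List[float]:
    """Count lexicon hits per emotion, normalized by the number of tokens."""
    tokens = text.lower().split()
    counts = {emotion: 0 for emotion in EMOTIONS}
    for token in tokens:
        emotion = EMOTION_LEXICON.get(token.strip(".,!?"))
        if emotion is not None:
            counts[emotion] += 1
    n_tokens = max(len(tokens), 1)  # avoid division by zero on empty text
    return [counts[emotion] / n_tokens for emotion in EMOTIONS]


def combine(text_features: List[float], text: str) -> List[float]:
    """Concatenate existing text features with the emotion features."""
    return text_features + emotion_features(text)
```

Keeping the emotion features as a separate block of the vector makes it easy to measure their contribution by training with and without them.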
6 CONCLUSION
In this study, we created a dataset of Japanese
fact-checked tweets on Twitter and performed binary
classification, which classifies a tweet as False or not,
and multinomial classification, which classifies a
tweet as True Rumor, Unclear Rumor, or False Rumor.
We also examined and discussed Japanese rumor
detection. In future work, we plan to apply this
method to languages other than Japanese and to
confirm its accuracy experimentally. Conversely, the
evaluation score of the current model needs to be
reconfirmed by comparing its results with existing
rumor detection models for other languages. Another
future goal is to build a more practical fact-checking
support system. To do so, we need to improve the
detection score by building other models, adding
features, and learning more effectively.
ACKNOWLEDGEMENTS
This work was supported by JSPS KAKENHI Grant
Numbers JP21H03496, JP22K12157.
REFERENCES
Islam, M. S., Sarkar, T., Khan, S. H., Kamal, A. H. M.,
Hasan, S. M., Kabir, A., ... & Seale, H. (2020).
COVID-19–related infodemic and its impact on public
health: A global social media analysis. The American
Journal of Tropical Medicine and Hygiene.
Liu, X., Nourbakhsh, A., Li, Q., Fang, R., & Shah, S. (2015).
Real-time Rumor Debunking on Twitter. In Proceedings
of the 24th ACM International on Conference on
Information and Knowledge Management, pp. 1867-
1870.
Ma, J., Gao, W., Mitra, P., Kwon, S., Jansen, B. J., Wong,
K. F., & Cha, M. (2016). Detecting rumors from
microblogs with recurrent neural networks.
In Proceedings of the 25th International Joint Conference
on Artificial Intelligence (IJCAI2016), pp. 3818-3824.
Ma, J., Gao, W., & Wong, K. F. (2018). Rumor detection on
Twitter with tree-structured recursive neural networks.
Association for Computational Linguistics, pp. 1980-
1989.
Bian, T., Xiao, X., Xu, T., Zhao, P., Huang, W., Rong, Y.,
& Huang, J. (2020). Rumor detection on social media
with bi-directional graph convolutional networks.
In Proceedings of the AAAI conference on artificial
intelligence, pp. 549-556.
Li, J., Bao, P., Shen, H., & Li, X. (2021). MiSTR: A
Multiview Structural-Temporal Learning Framework
for Rumor Detection. IEEE Transactions on Big Data.
Lao, A., Shi, C., & Yang, Y. (2021). Rumor detection with
field of linear and non-linear propagation.
In Proceedings of the Web Conference 2021, pp. 3178-
3187.
Cheng, M., Li, Y., Nazarian, S., & Bogdan, P. (2021). From
rumor to genetic mutation detection with explanations:
a GAN approach. Nature Scientific Reports.
Ma, J., Li, J., Gao, W., Yang, Y., & Wong, K. F. (2021).
Improving Rumor Detection by Promoting Information
Campaigns with Transformer-based Generative
Adversarial Learning. IEEE Transactions on
Knowledge and Data Engineering.