addressed. In addition, since our method strongly depends on the document filter, filters for other datasets must be developed before the method can be applied universally. We leave these studies as future work.
ACKNOWLEDGEMENTS
This work was supported by JSPS KAKENHI Grant Numbers 20H04295, 20K20406, and 20K20625.