9 CONCLUSION
In this work, we presented a theoretically sounder, context-aware partition of Wikipedia article sections for span-based question answering. We showed that context-preserving passages yield consistent quality gains over incidental, context-unaware passages, even after recasting the latter for robust sampling. Our empirical analysis of an information-seeking QA task substantiated our contention that the pivotal answerable paragraph is distinctly identified among the remaining section paragraphs, which are predicted unanswerable. Using a sustainable context-less QA dataset proved to incur only an inconsequential runtime cost for inline queries.
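To make the partition concrete, the sketch below illustrates one plausible reading of context-preserving subdivision: each section paragraph is emitted as a passage that carries its article title and section heading, so a span-based reader sees self-contained context. The function name, data layout, and separator are our own illustrative assumptions, not the implementation evaluated in this paper.

```python
# A minimal sketch (not the authors' implementation) of context-preserving
# subdivision of a Wikipedia article: one passage per section paragraph,
# each prefixed with the article title and section heading.

def context_preserving_passages(title, sections):
    """Yield one passage per paragraph, each keeping its surrounding context.

    `sections` is assumed to be a list of (heading, [paragraph, ...]) pairs,
    e.g. as produced by an upstream article parser.
    """
    for heading, paragraphs in sections:
        for paragraph in paragraphs:
            # Prepend title and heading so the passage is self-contained.
            yield f"{title} - {heading}\n{paragraph}"


if __name__ == "__main__":
    article = ("Alan Turing", [
        ("Early life", ["Turing was born in Maida Vale, London ..."]),
        ("Cryptanalysis", ["During the Second World War, Turing worked ..."]),
    ])
    for passage in context_preserving_passages(*article):
        print(passage, "\n")
```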
We encourage the research community to adopt data splitting more broadly as a core retriever functionality, and to subsequently embrace natural context retrieval, which we deem essential for dialog and multi-hop QA. We further envision incorporating neural text simplification to improve the efficacy of QA tasks.
ACKNOWLEDGMENTS
We would like to thank the anonymous reviewers for
their insightful suggestions and feedback.