Authors:
László Tóth ¹; Balázs Nagy ¹; Dávid Janthó ¹; László Vidács ² and Tibor Gyimóthy ²
Affiliations:
¹ Department of Software Engineering, University of Szeged, Hungary
² Department of Software Engineering, University of Szeged, Hungary; MTA-SZTE Research Group of Artificial Intelligence, University of Szeged, Hungary
Keyword(s):
Question Answering, Q&A, Stack Overflow, Quality, Natural Language Processing, NLP, Deep Learning, Doc2Vec.
Abstract:
Online question answering (Q&A) forums like Stack Overflow play an increasingly important role in supporting the daily tasks of developers. Stack Overflow can be considered a meeting point of experienced developers and those who are looking for a solution to a specific problem. Since anyone with any background and experience level can ask and respond to questions, the community uses different measures to maintain quality, such as closing and deleting inappropriate posts. With over 8,000 posts arriving on Stack Overflow every day, effective automatic filtering is essential. In this paper, we present a novel approach for classifying questions based exclusively on their linguistic and semantic features using a deep learning method. Our binary classifier, which relies only on the textual properties of posts, can predict whether a question is to be closed with an accuracy of 74%, similar to the results of previous metrics-based models. Based on our findings, we conclude that by combining deep learning and natural language processing methods, the maintenance of quality at Q&A forums could be supported using only the raw text of posts.