Authors:
Faiz Ali Shah, Kairit Sirts, and Dietmar Pfahl
Affiliation:
Institute of Computer Science, University of Tartu, J. Liivi 2, 50409 Tartu, Estonia
Keyword(s):
App Review Classification, Convolutional Neural Networks, Linguistic Resources, Bag of Words.
Related Ontology Subjects/Areas/Topics:
Artificial Intelligence; Knowledge Management and Information Sharing; Knowledge-Based Systems; Requirements Engineering; Symbolic Systems
Abstract:
User reviews submitted to app marketplaces contain information that falls into different categories, e.g., feature evaluation, feature request, and bug report. This information is valuable to developers for improving the quality of mobile applications. However, due to the large volume of reviews received every day, manually classifying user reviews into these categories is not feasible, so automatic classification methods based on machine learning are desirable. In this study, we compare the simplest textual machine learning classifier, one that uses only lexical features, the so-called Bag-of-Words (BoW) approach, with the more complex models used in previous work that adopt rich linguistic features. We find that the simple BoW model is very competitive and has the advantage of not requiring any external linguistic tools to extract the features. Moreover, we experiment with deep-learning-based Convolutional Neural Network (CNN) models, which have recently achieved state-of-the-art results in many classification tasks. We find that, on average, the CNN models do not perform better than the simple BoW model; it is possible that a larger training set would have been necessary for the CNN models to gain an advantage.
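To make the BoW idea concrete, the following is a minimal sketch of a classifier that uses only word counts as features. It assumes a multinomial Naive Bayes model with add-one smoothing, which is not necessarily the classifier used in the study; the function names (`tokenize`, `train`, `predict`) and the toy reviews are hypothetical and invented purely for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and keep runs of letters/apostrophes; no external
    # linguistic tools (taggers, parsers) are involved.
    return re.findall(r"[a-z']+", text.lower())


def train(reviews):
    # reviews: list of (text, label) pairs.
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of reviews
    vocab = set()
    for text, label in reviews:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, text):
    # Assumption: multinomial Naive Bayes with add-one smoothing.
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not taken from the study.
TOY_REVIEWS = [
    ("The app crashes every time I open the camera", "bug report"),
    ("Login fails and the screen freezes", "bug report"),
    ("Please add a dark mode option", "feature request"),
    ("It would be great to add export to PDF", "feature request"),
    ("I love the new search feature, works great", "feature evaluation"),
    ("The photo editor is fast and easy to use", "feature evaluation"),
]

model = train(TOY_REVIEWS)
print(predict(model, "The app crashes and freezes"))
```

The only preprocessing here is lowercasing and splitting on non-letter characters, which reflects the point made in the abstract that a BoW model needs no external linguistic resources. A real experiment would need a far larger labeled set than this toy list.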