Authors:
Thomas Karanikiotis; Michail D. Papamichail; Ioannis Gonidelis; Dimitra Karatza and Andreas L. Symeonidis
Affiliation:
Electrical and Computer Engineering Dept., Aristotle University of Thessaloniki, Intelligent Systems & Software Engineering Labgroup, Information Processing Laboratory, Thessaloniki, Greece
Keyword(s):
Developer-perceived Readability, Readability Interpretation, Size-based Clustering, Support Vector Regression.
Abstract:
In the context of collaborative, agile software development, where effective and efficient software maintenance is of utmost importance, the need to produce readable source code is evident. Towards this direction, several approaches aspire to assess the extent to which a software component is readable. Most of them rely on experts who are responsible for determining the ground truth and/or setting custom evaluation criteria, leading to results that are context-dependent and subjective. In this work, we employ a large set of static analysis metrics along with various coding violations towards interpreting readability as perceived by developers. In an effort to provide a fully automated and extendible methodology, we refrain from using experts; rather, we harness data residing in online code hosting facilities towards constructing a dataset that includes more than one million methods that cover diverse development scenarios. After performing clustering based on source code size, we employ Support Vector Regression in order to interpret the extent to which a software component is readable on three axes: complexity, coupling, and documentation. Preliminary evaluation on several axes indicates that our approach effectively interprets readability as perceived by developers against the aforementioned three primary source code properties.
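To make the size-based clustering step concrete, the following is a minimal sketch only. The abstract does not name the clustering algorithm, so a one-dimensional k-means is used here as a stand-in; the function names `kmeans_1d` and `assign`, the choice of three clusters, and the toy method sizes are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def kmeans_1d(values, k, iterations=50):
    """Cluster scalar values (e.g. method sizes in lines of code) into k groups.

    Assumption: plain 1-D k-means stands in for the unspecified
    clustering technique mentioned in the abstract.
    """
    ordered = sorted(values)
    # Deterministic initialisation: pick centroids spread across the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            groups[nearest].append(v)
        # Keep the old centroid if a group ends up empty.
        updated = [mean(g) if g else centroids[i] for i, g in enumerate(groups)]
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids


def assign(value, centroids):
    """Return the index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda c: abs(value - centroids[c]))


# Hypothetical method sizes (lines of code), for illustration only.
method_sizes = [3, 4, 5, 6, 20, 22, 25, 28, 90, 100, 110]
centroids = kmeans_1d(method_sizes, k=3)

clusters = {}
for size in method_sizes:
    clusters.setdefault(assign(size, centroids), []).append(size)
```

Under this reading, a separate regressor (Support Vector Regression in the paper) would then be fitted on the methods of each size cluster, so that small and large methods are not judged against the same metric ranges.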