figshare
6 files

Improving Change Prediction Models with Code Smell-Related Information

Total size: 14.64 MB. This item is shared privately.
dataset
modified on 2020-02-28, 10:59
Code smells represent sub-optimal implementation choices applied by developers when evolving software systems. The negative impact of code smells has been widely investigated in the past: beyond their effect on developers' productivity and ability to comprehend source code, researchers have empirically shown that the presence of code smells heavily impacts the change-proneness of the affected classes. On the basis of these findings, in this paper we conjecture that code smell-related information can be effectively exploited to improve the performance of change prediction models, i.e., models whose goal is to indicate to developers which classes are more likely to change in the future, so that they may apply preventive maintenance actions. Specifically, we exploit the so-called intensity index (a previously defined metric that captures the severity of a code smell) and evaluate its contribution when added as an additional feature to three state-of-the-art change prediction models based on product, process, and developer-based features. We also compare the performance achieved by the proposed model with that of an alternative technique that considers the previously defined antipattern metrics, namely a set of indicators computed from the history of code smells in files. Our results show that (i) the prediction performance of the intensity-including models is statistically better than that of the baselines and (ii) the intensity is a more powerful metric than the alternative smell-related ones. Nevertheless, we observed some complementarities between the sets of change-prone and non-change-prone classes correctly classified by the models relying on intensity and on antipattern metrics: for this reason, we devise and evaluate a smell-aware combined change prediction model that includes product, process, developer-based, and smell-related features.
We show that this model has an F-Measure that is up to 20% higher than the existing state-of-the-art models.
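
For readers unfamiliar with the metric, the F-Measure mentioned above is commonly defined as the harmonic mean of precision and recall on the positive (here, change-prone) class. The following is a minimal sketch under that assumption; the abstract does not state which variant of the metric was used, and the function name and the counts in the usage lines are hypothetical, not taken from the dataset.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-Measure (F1) for the change-prone class.

    tp: change-prone classes correctly predicted as change-prone
    fp: non-change-prone classes wrongly predicted as change-prone
    fn: change-prone classes the model missed
    All three counts are illustrative inputs, not values from the study.
    """
    if tp == 0:
        # No correct positive predictions: precision and recall are both
        # zero (or undefined), so the score is reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for two models evaluated on the same classes.
baseline_score = f_measure(tp=6, fp=2, fn=6)
smell_aware_score = f_measure(tp=8, fp=2, fn=2)
```

Comparing two such scores computed on the same test set is one simple way to read a claim like "up to 20% higher", although the abstract does not specify whether the difference is absolute or relative.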