figshare
Browse
ga_recommendation_SBES_2017.pdf (849.68 kB)

Tweaking Association Rules to Optimize Software Change Recommendations

Download (849.68 kB)
journal contribution
posted on 2017-09-06, 11:48 authored by Mairieli WesselMairieli Wessel, Maurício Finavaro Aniche, Igor Scaliante Wiese, Gustavo Ansaldi Oliva, Marco Aurélio Gerosa
Past researchs have been trying to recommend artifacts that are likely to change together in a task to assist developers in making changes to a software system, often using techniques like association rules. Association rules learning is a data mining technique that has been frequently used to discover evolutionary couplings. These couplings constitute a fundamental piece of modern change prediction techniques. However, using association rules to detect evolutionary coupling requires a number of configuration parameters, such as measures of interest (e.g. support and confidence), their cut-off values, and the portion of the commit history from which co-change relationships will be extracted. To accomplish this set up, researchers have to carry out empirical studies for each project, testing a few variations of the parameters before choosing a configuration. This makes it difficult to use association rules in practice, since developers would need to perform experiments before applying the technique and would end up choosing non-optimal solutions that lead to wrong predictions. In this paper, we propose a fitness function for a Genetic Algorithm that optimizes the co-change recommendations and evaluate it on five open source projects (CPython, Django, Laravel, Shiny and Gson). The results indicate that our genetic algorithm is able to find optimized cut-off values for support and confidence, as well as to determine which length of commit history yields the best recommendations. We also find that, for projects with less commit history (5k commits), our approach produced better results than the regression function proposed in the literature. This result is particularly encouraging, because repositories such as GitHub host many young projects. Our results can be used by researchers when conducting co-change prediction studies and by tool developers to produce automated support to be used by practitioners.

History

Usage metrics

    Licence

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC