figshare

CHIGA's impact on defect prediction performance

figure
posted on 2025-04-10, 12:44 authored by Sheunopa Charumbira

The study introduces CHIGA (Chi-square Genetic Algorithm), a hybrid metric selection technique that enhances metric selection and improves defect prediction performance. The performance of defect prediction models is often dependent on the dimensionality of metrics present in the dataset. This dependency leads to unexpected performance fluctuations. The variability in performance across datasets affects the generalizability of defect prediction models. CHIGA aims to enhance the generalizability and robustness of defect prediction models by enforcing effective metric selection. CHIGA achieves this by combining the chi-square technique for metric ranking and a binary-encoded genetic algorithm for feature subset selection.
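The two-stage idea described above (chi-square ranking followed by a binary-encoded genetic algorithm) can be illustrated with a minimal sketch. This is not the study's implementation: the count-based chi-square formula, the function names (`chi2_scores`, `rank_metrics`, `genetic_select`), the GA operators (tournament selection, one-point crossover, bit-flip mutation, elitism), all parameter values, and the toy fitness function are assumptions chosen for illustration. In practice the fitness would more plausibly be a classifier score such as cross-validated AUC.

```python
import random

def chi2_scores(X, y):
    """Chi-square score of each non-negative metric against the defect label.

    Assumption: count-based form, comparing the per-class sum of a metric
    with the sum expected if the metric were independent of the label.
    """
    n, m = len(X), len(X[0])
    classes = sorted(set(y))
    scores = []
    for j in range(m):
        total = sum(row[j] for row in X)
        score = 0.0
        for c in classes:
            observed = sum(row[j] for row, label in zip(X, y) if label == c)
            expected = total * sum(1 for label in y if label == c) / n
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores

def rank_metrics(X, y, keep):
    """Stage 1: indices of the `keep` highest-scoring metrics."""
    scores = chi2_scores(X, y)
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:keep]

def genetic_select(candidates, fitness, pop_size=20, generations=30,
                   crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Stage 2: binary-encoded GA over the ranked candidates.

    Each chromosome is a list of 0/1 bits; bit i == 1 keeps candidates[i].
    `fitness` is a caller-supplied function (hypothetical here) that scores
    a list of metric indices; higher is better.
    """
    rng = random.Random(seed)
    n = len(candidates)

    def decode(bits):
        return [c for c, b in zip(candidates, bits) if b]

    def score(bits):
        subset = decode(bits)
        return fitness(subset) if subset else float("-inf")

    def tournament():
        a, b = rng.sample(population, 2)
        return a if score(a) >= score(b) else b

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(population, key=score)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best chromosome forward
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if n > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # bit-flip mutation
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        population = children
        best = max(population, key=score)
    return decode(best)

# Toy data: 4 modules, 3 metrics; label 1 = defective (illustrative only).
X = [[3, 0, 1], [4, 1, 1], [0, 1, 1], [1, 0, 1]]
y = [1, 1, 0, 0]

def toy_fitness(subset):
    # Hypothetical stand-in for a classifier score: rewards metric 0,
    # penalises larger subsets.
    return (1.0 if 0 in subset else 0.0) - 0.1 * len(subset)

scores = chi2_scores(X, y)
ranked = rank_metrics(X, y, keep=2)
selected = genetic_select(ranked, toy_fitness)
print("chi-square scores:", scores)
print("ranked candidates:", ranked)
print("selected subset:", selected)
```

Ranking first shrinks the search space the genetic algorithm has to explore, which is one plausible motivation for combining the two techniques; the cut-off `keep` is a tuning choice in this sketch, not a value taken from the study.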


CHIGA's performance is evaluated using five benchmarked datasets, nine metric selection techniques, and three classification algorithms. We employ analysis of variance, the area under the receiver operating characteristic curve, and several other statistical techniques to analyze and measure CHIGA's impact on defect prediction performance. The results of our investigation indicate that CHIGA performs competitively across various combinations of datasets and classification algorithms. Notably, CHIGA reduces the performance fluctuation rate by an average of 45.3% when compared to nine established metric selection techniques. A significant reduction in the performance fluctuation rate enhances the reusability of defect prediction models. The ability to reuse and apply these models in different contexts is essential in an era where software underpins every facet of human life.

Department/Unit

Computer Science

Sustainable Development Goals

  • 9 Industry, Innovation and Infrastructure