Predictive performance of random forests models of the KEN (top) and KTZ (bottom) datasets.
A, C: ROC curves for the random forests models built on antibodies only (red line), exposure proxies only (blue line) and all variables (green line). The area under the curve (AUC) of each model is given in the legend of the respective panel, whilst the dashed grey line marks AUC = 0.5, that is, the performance expected from randomly guessing the individual’s status. B, D: Illustrated confusion matrices derived from the models built on all available variables (antibodies, Ab, and exposure proxies, Exp). The sizes of the diagonal wedges correspond to the true positive and true negative rates, whereas the sizes of the off-diagonal wedges correspond to the false positive and false negative rates.