Erlotinib (left panel) and sorafenib (right panel): log(pval_clinical), from the Pearson correlation coefficient of each training model's prediction of the clinical response (x-axis), versus log(pval_IC50), from the correlation coefficient of each model's prediction of IC50 versus the mean of each gene's expression in the training model (y-axis).
These results represent 20 million random picks of 30 tumor cells and 300 genes from the CGP database of IC50 values for erlotinib and sorafenib. For erlotinib, only 53 simulations achieved the arbitrary threshold requirements of log(pval_IC50) < -11, log(pval_clinical) < -6, ppv_clinical < 0.45 and npv_clinical < 0.45. These models appear as the red circles in the left panel. For sorafenib, only 48 simulations achieved the threshold requirements of log(pval_IC50) < -8.5, log(pval_clinical) < -8.5, ppv_clinical < 0.65 and npv_clinical < 0.65. PPV and NPV calculations require selection of a boundary between good and poor responses; these calculations use the mean of the predicted values as this boundary. The figure shows that some training models with excellent correlation statistics nevertheless fail to meet the thresholds for PPV and NPV.
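The PPV and NPV calculation with the mean of the predicted values as the boundary can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `ppv_npv`, the `lower_is_good` flag, and the assumption that a prediction below the mean is called a good response (as it would be for a predicted IC50) are all hypothetical choices made for the example.

```python
from statistics import mean


def ppv_npv(predicted, responded, lower_is_good=True):
    """Return (PPV, NPV) using the mean of the predictions as the boundary.

    predicted     : model outputs, one per patient (e.g. predicted IC50)
    responded     : booleans, True if the patient had a good clinical response
    lower_is_good : assumption; True means values below the mean are called
                    "good response", False reverses the direction
    """
    boundary = mean(predicted)
    if lower_is_good:
        called_good = [p < boundary for p in predicted]
    else:
        called_good = [p > boundary for p in predicted]

    pairs = list(zip(called_good, responded))
    tp = sum(1 for c, r in pairs if c and r)          # called good, was good
    fp = sum(1 for c, r in pairs if c and not r)      # called good, was poor
    tn = sum(1 for c, r in pairs if not c and not r)  # called poor, was poor
    fn = sum(1 for c, r in pairs if not c and r)      # called poor, was good

    # PPV = TP / (TP + FP); NPV = TN / (TN + FN); undefined if no calls made
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv
```

For example, `ppv_npv([1.0, 2.0, 3.0, 6.0], [True, True, False, False])` places the boundary at 3.0, calls the first two patients good responders, and gives a PPV and NPV of 1.0 each. Because the boundary depends only on the predictions, a model can correlate strongly with clinical response and still split patients poorly at the mean, which is the situation the caption describes.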