Prediction of Synaptic Connectivity Signatures as a Function of the Most Informative Genes
The blue line shows prediction accuracy, measured by AUC, as a function of the number of genes selected for the predictor. The top panel shows the outgoing connectivity results, and the lower panel shows the incoming connectivity results. The rightmost point (289 genes) denotes the prediction outcome before any feature selection is applied to the data. The blue line summarizes 5-fold cross-validation repetitions of the selection–prediction scheme (mean and standard deviation are displayed). The red dashed lines represent the empirical null distribution obtained by performing the same selection–prediction scheme on random data (constructed by shuffling the identities of the neurons, see Materials and Methods). Maximum AUC values are achieved with 53 and 30 features in the incoming and outgoing assays, respectively, with corresponding p-values of p = 10⁻⁹⁹ and p = 10⁻⁹⁷, calculated by applying a one-sided t-test between the original and shuffled data (see Materials and Methods).
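As a rough illustration of the evaluation described in this caption, the following Python sketch computes AUC, builds an empirical null by shuffling labels, and compares the observed and shuffled AUC values with a t statistic. It is not the authors' pipeline. The data are synthetic, the gene selection and cross-validation steps are omitted, the unequal-variance (Welch) form of the t statistic is an assumption, and the names `auc`, `shuffled_aucs`, `welch_t`, and `noisy_scores` are hypothetical. In the scheme described above, the entire selection–prediction procedure is rerun on the shuffled data, whereas this sketch only rescores fixed predictions against shuffled labels.

```python
import random
import statistics


def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shuffled_aucs(scores, labels, n_shuffles=200, seed=0):
    """Empirical null: AUC values obtained after randomly permuting the labels."""
    rng = random.Random(seed)
    shuffled = list(labels)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        null.append(auc(scores, shuffled))
    return null


def welch_t(a, b):
    """Welch (unequal-variance) t statistic for mean(a) - mean(b)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)


# Synthetic toy data (an assumption for illustration, not the paper's data):
# 20 positives and 20 negatives, with scores equal to the label plus Gaussian noise.
rng = random.Random(1)
labels = [1] * 20 + [0] * 20


def noisy_scores():
    return [y + rng.gauss(0.0, 1.0) for y in labels]


# Stand-in for repeated cross-validation runs: ten independent noisy score sets.
observed = [auc(noisy_scores(), labels) for _ in range(10)]
null = shuffled_aucs(noisy_scores(), labels)
t_stat = welch_t(observed, null)

print("mean observed AUC:", round(statistics.mean(observed), 3))
print("mean shuffled AUC:", round(statistics.mean(null), 3))
print("Welch t statistic:", round(t_stat, 2))
```

A one-sided p-value would be the upper-tail probability of this statistic under a t distribution. If SciPy is available, `scipy.stats.ttest_ind(observed, null, equal_var=False, alternative="greater")` is one way to obtain it.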