Figure_2.tif (434.03 kB)
Performance of CONC with Different Input Features
posted on 2013-02-22, 08:44 authored by Jinfeng Liu, Julian Gough, Burkhard Rost

F-measures (harmonic mean of specificity and sensitivity; see Materials and Methods) were calculated for different SVMs for both the coding (A) and non-coding (B) predictions. Because the coding set was twice as large as the non-coding set, the percentage of incorrect predictions was larger for the non-coding set, hence its smaller F-measures. When used individually, input features achieved F-measures of 67.6 to 90.9 on the non-coding set. Combining the features improved the performance to 97.4 for coding and 94.5 for non-coding. In comparison, ESTScan achieved F-measures of 86.7 and 69.9 for coding and non-coding predictions, respectively. The top-performing individual features were the number of homologs in the protein database and peptide length.
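
As a minimal sketch of the F-measure described in the caption (the harmonic mean of specificity and sensitivity), the calculation could look like the following. The function name `f_measure` and the input values are hypothetical and chosen for illustration only. The exact definitions of specificity and sensitivity are given in the paper's Materials and Methods and are not reproduced here.

```python
def f_measure(specificity: float, sensitivity: float) -> float:
    """Return the harmonic mean of two rates given as fractions in [0, 1].

    Assumption: both inputs are already computed according to the paper's
    Materials and Methods; this function only combines them.
    """
    total = specificity + sensitivity
    if total == 0:
        # Both rates are zero, so the harmonic mean is taken to be zero.
        return 0.0
    return 2 * specificity * sensitivity / total


if __name__ == "__main__":
    # Hypothetical rates, not values taken from the figure.
    score = f_measure(0.90, 0.80)
    # Scale to a percentage to match the 0-100 range used in the caption.
    print(f"F-measure: {100 * score:.1f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a predictor scores well only when both rates are high.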