The impact of three different filtering methods on the retrieval performance of the weighted semantic network.
Panels A–C: PPI retrieval performance (true positive rate or recall), measured as the area under the ROC curve (AUC; ordinate). Panels D–F: retrieval performance for known gene-disease associations. An AUC value of 0.5 indicates no retrieval power above random expectation. The weighted semantic network is filtered either by incrementally removing generic information (heavy curve, from right to left) or by incrementally removing specific information (inverse filtering, light curve, from left to right). The filter threshold is indicated on the abscissa. Panels A, B, D, and E represent node filtering approaches, while panels C and F represent edge filtering (see Method section for details). The red circle in panel A indicates the PPI retrieval performance (0.83) for a network from which 99.52% of the nodes have been removed (i.e., all concepts occurring in 100,000 abstracts or fewer).
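
As a reading aid for the ordinate, the following minimal sketch shows one standard way to compute the area under the ROC curve, using the rank (Mann-Whitney) formulation. It is an illustration only and is not taken from the paper: the function name `roc_auc` and the toy scores and labels are assumptions, and the authors' actual pipeline may differ. It makes concrete why a value of 0.5 corresponds to no retrieval power above random expectation.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted association scores (higher = more likely a true pair).
    labels: truthy for known positives (e.g. a known PPI), falsy otherwise.
    Both names are hypothetical; the paper does not specify an implementation.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    # AUC equals the probability that a random positive outscores a random
    # negative; ties count as half a win.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (invented for illustration, not results from the paper).
perfect = roc_auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])        # positives always rank higher
uninformative = roc_auc([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])  # scores carry no information
```

In this sketch a scorer that ranks every positive above every negative reaches an AUC of 1.0, while a scorer whose values cannot separate the two classes sits at 0.5, the random baseline referred to in the caption.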