pone.0297560.s002.tif (22.31 MB)

Alternative prediction method performance and dataset overlap.

figure

posted on 2024-01-25, 18:35 authored by Eli Fritz McDonald, Kathryn E. Oliver, Jonathan P. Schlebach, Jens Meiler, Lars Plate

A. Receiver operating characteristic (ROC) curves for ESM predictions [39] of 169 CFTR missense variants, including 110 CF-causing variants, 41 variants of variable clinical consequence (VVCC), and 18 non-CF-causing variants. For the pathogenic curve (violet), we considered a pathogenic prediction of a CF-causing variant a true positive. For the ambiguous curve (grey), we considered an ambiguous prediction of a VVCC a true positive. For the benign curve (blue-green), we considered a benign prediction of a non-CF-causing variant a true positive. B. ROC curves calculated as in A, but using EVE missense variant predictions [40] of 169 CFTR missense variants, colored as in A. C. Venn diagrams depicting the overlap of the datasets used throughout the study: our expanded ClinVar dataset, the deep mutational scanning (DMS) dataset [23], our curated CFTR2 dataset, and the missense variants from the Bihler et al. dataset [33].

(TIF)
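
The one-vs-rest construction described for panels A and B can be sketched in a few lines of Python. This is only an illustration of the general technique, not the authors' analysis code: the function names (`roc_points`, `auc`, `one_vs_rest_auc`), the assumption that each variant carries one numeric score per class, and the toy labels and scores are all hypothetical and are not taken from the study.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], is_positive: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points.

    The decision threshold is swept from the highest score to the lowest;
    a point is emitted only after each group of tied scores.
    """
    n_pos = sum(1 for p in is_positive if p)
    n_neg = len(is_positive) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for rank, i in enumerate(order):
        if is_positive[i]:
            tp += 1
        else:
            fp += 1
        last = rank == len(order) - 1
        if last or scores[order[rank + 1]] != scores[i]:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def one_vs_rest_auc(labels: Sequence[str], class_scores: Sequence[float], target: str) -> float:
    """Treat `target` as the positive class and every other label as negative."""
    is_positive = [label == target for label in labels]
    return auc(roc_points(class_scores, is_positive))


if __name__ == "__main__":
    # Hypothetical toy data, NOT values from the study.
    labels = ["CF-causing", "CF-causing", "VVCC", "non-CF-causing"]
    pathogenic_scores = [0.9, 0.7, 0.4, 0.1]  # assumed per-variant pathogenic score
    print(one_vs_rest_auc(labels, pathogenic_scores, "CF-causing"))
```

Under this reading, each colored curve corresponds to one call of `one_vs_rest_auc` with a different `target` class and the score associated with that class; how the published predictors' outputs map to such per-class scores is not specified in the caption and is left as an assumption here.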