TY - DATA
T1 - Direct validation of imputed non-synonymous SNP alleles.
PY - 2016/09/07
AU - Glendon J. Parker
AU - Tami Leppert
AU - Deon S. Anex
AU - Jonathan K. Hilmer
AU - Nori Matsunami
AU - Lisa Baird
AU - Jeffery Stevens
AU - Krishna Parsawar
AU - Blythe P. Durbin-Johnson
AU - David M. Rocke
AU - Chad Nelson
AU - Daniel J. Fairbanks
AU - Andrew S. Wilson
AU - Robert H. Rice
AU - Scott R. Woodward
AU - Brian Bothner
AU - Bradley R. Hart
AU - Mark Leppert
UR - https://plos.figshare.com/articles/figure/Direct_validation_of_imputed_non-synonymous_SNP_alleles_/3812199
DO - 10.1371/journal.pone.0160653.g001
L4 - https://ndownloader.figshare.com/files/5936502
KW - Hair Shaft Proteome Human identification
KW - African population
KW - non-synonymou
KW - hair shaft datasets
KW - PCR
KW - Genetically variant peptides
KW - 66 European-American subjects
KW - mass spectrometry-based shotgun proteomics
KW - nucleotide polymorphism profiles
KW - hair shaft proteins
KW - DNA
KW - Protein-Based Human Identification
KW - nucleotide polymorphism allelic profiles
KW - nucleotide polymorphism alleles
KW - hair shaft protein
N2 - A) Genetically variant peptides (GVPs) that contained single amino-acid polymorphisms (SAPs) were identified in both European-American cohorts (EA1 and EA2) and collated for each subject. Imputed nsSNP alleles (Gene Name = GN, SNP accession number = rs#, allele nucleotide = nuc) were compared with the genotype obtained by direct Sanger sequencing (S1 Methods). Correctly imputed nsSNP alleles (TP, true positives) are indicated by blue squares. Imputed alleles that were incorrectly predicted (FP, false positives) are indicated by red squares. Alleles that were identified using Sanger sequencing but had no resulting GVP in the matching proteomic dataset (FN, false negatives) are indicated by light green squares.
Alleles absent from both the subject's DNA and the resulting proteomic dataset (TN, true negatives) are indicated by white squares [49]. Failed Sanger sequencing determination of nsSNP allelic status is indicated by grey. B) The effectiveness of each SAP-containing peptide in imputing nsSNP alleles was also quantified. The sensitivity of each genetically variant peptide, measured as the proportion of nsSNP alleles that are correctly detected and imputed (TP/(TP + FN)), was calculated as a percentage (log10(%)). The positive predictive value (PPV) of genetically variant peptide-based SNP imputations was calculated as the percentage of all imputations that were validated as correct (TP/(TP + FP); log10(%)) [49]. C)
ER -