figshare

Variant prediction in the age of Machine Learning

Version 2 2023-10-20, 20:52
Version 1 2023-07-17, 19:47
Posted on 2023-10-20 - 20:52 authored by Prabakaran Ramakrishnan

The items in this collection are part of the study titled "Variant prediction in the age of Machine Learning".



Abstract:

Over the years, many computational methods have been created for analyzing the impact of single amino acid substitutions resulting from single nucleotide variants (SNVs) in genome coding regions. Historically, all models have been limited by the inadequate sizes of experimentally curated datasets and by the lack of a standardized definition of impact. The emergence of protein language models (pLMs) has raised an important question: Can machines learn the language of life from unannotated protein sequence data well enough to identify significant errors in the protein “words”? Our analysis suggests that some pLMs perform as well as or better than existing supervised methods. pLM performance, however, varies with the type of impact being predicted. New methods of variant evaluation are particularly needed in the space where existing tools underperform. Consequently, further analysis is needed to establish pLM performance for “dark matter” proteins, those with no homologs in pLM training data.
