Posted on 2024-01-03, 01:15. Authored by Gen Li, Sijie Yao, Long Fan
Understanding protein thermodynamic stability is essential to clarifying the relationships
among structure, function, and interaction. Therefore, developing
a faster and more accurate method to predict the impact of mutations
on protein stability is helpful for protein design and for understanding
phenotypic variation. Recent studies have shown that protein embeddings
are particularly powerful at modeling context-dependent sequence
information in tasks such as subcellular localization, variant effect,
and secondary structure prediction. Herein, we introduce a novel method,
ProSTAGE, a deep learning method that fuses structure and
sequence embeddings to predict protein stability changes upon single
point mutations. Our model combines graph-based techniques with protein
language models to make these predictions. Moreover, ProSTAGE is trained
on a larger data set, almost twice as large as the most commonly used
data set, S2648. It consistently outperforms all existing state-of-the-art
methods at predicting mutation effects, as benchmarked on several independent
data sets. Using protein embeddings as the prediction input achieves
better results than previously reported, which shows the potential
of protein language models in predicting the effect of mutations on
proteins. ProSTAGE is implemented as a user-friendly web server.
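
The idea of fusing structure and sequence embeddings can be illustrated with a minimal sketch. This is not the ProSTAGE architecture, which the abstract does not specify; it only assumes that a contact graph built from C-alpha coordinates stands in for the "graph-based techniques" and that per-residue vectors stand in for language-model embeddings. The function names, the 8 angstrom cutoff, the single mean-aggregation layer, and the random weights and inputs are all hypothetical choices made for illustration.

```python
import numpy as np

def contact_graph(coords, cutoff=8.0):
    """Adjacency matrix linking residues whose C-alpha atoms lie within
    `cutoff` angstroms (the cutoff value is an assumption)."""
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Self-loops are included because each residue is at distance 0 from itself.
    return (dists < cutoff).astype(float)

def graph_layer(adj, feats, weight):
    """One mean-aggregation message-passing step followed by a ReLU."""
    deg = adj.sum(axis=1, keepdims=True)   # >= 1 thanks to self-loops
    pooled = (adj @ feats) / deg           # average over graph neighbours
    return np.maximum(pooled @ weight, 0.0)

def predict_ddg(coords, wt_emb, mut_emb, site, params):
    """Fuse structure-aware and raw sequence features at the mutated site
    and map them to a scalar stability-change score."""
    adj = contact_graph(coords)
    h_wt = graph_layer(adj, wt_emb, params["w_graph"])
    h_mut = graph_layer(adj, mut_emb, params["w_graph"])
    fused = np.concatenate([h_mut[site] - h_wt[site],      # structure branch
                            mut_emb[site] - wt_emb[site]])  # sequence branch
    return float(fused @ params["w_out"] + params["b_out"])

# Toy usage with random stand-ins for real coordinates and embeddings.
rng = np.random.default_rng(0)
n_res, emb_dim, hid_dim, site = 12, 16, 8, 5
coords = rng.normal(scale=5.0, size=(n_res, 3))
wt_emb = rng.normal(size=(n_res, emb_dim))
mut_emb = wt_emb.copy()
mut_emb[site] = rng.normal(size=emb_dim)   # only the mutated residue changes
params = {
    "w_graph": rng.normal(size=(emb_dim, hid_dim)),
    "w_out": rng.normal(size=hid_dim + emb_dim),
    "b_out": 0.0,
}
score = predict_ddg(coords, wt_emb, mut_emb, site, params)
```

With untrained random weights the score carries no physical meaning; in practice the weights would be fitted to measured stability changes. Feeding the model the difference between mutant and wild-type features means an unchanged sequence yields exactly the bias term, and with a zero bias the forward and reverse mutations receive scores of opposite sign.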