Comparative model of novel coronavirus 2019-nCoV protease Mpro
Dataset posted on 05.02.2020 by Christian C. Gruber, Georg Steinkellner
Comparative model summary: This is the protease Mpro encoded in the RNA of the novel coronavirus 2019-nCoV. The virus was first reported in 2019 and is now spreading from Wuhan, China, the primary location of the outbreak. The protein sequence was determined from the Wuhan seafood market pneumonia virus genome (NCBI genome ID MN908947, GenBank: MN908947.3) published by Wu et al. (LOCUS MN90894, 23-JAN-2020) by multiple sequence alignment with known SARS proteases. Aligning the “orf1ab polyprotein” sequence region of 2019-nCoV with the sequence of PDB 5N5O, the SARS main protease deposited by Zhang and Hilgenfeld in 2017, yields a protease of 306 amino acids. Compared to 2H2Z, another SARS crystal structure published by the Rao lab in 2006, the sequence identity is 96.1 % and the similarity is 98.7 %, with 100 % of residues aligned. Given this high sequence identity, the model is expected to be of high accuracy.
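As an illustration of how figures such as identity, similarity and aligned residues can be derived from a pairwise alignment, here is a minimal Python sketch. It is not the procedure used to build this model. The function name `alignment_stats`, the residue grouping in `SIMILAR_GROUPS`, and the two toy peptides are assumptions made for the example. Real alignment tools usually score similarity with a substitution matrix such as BLOSUM62, and they differ in which denominator they use for the percentages.

```python
# Toy residue groups treated as "similar" (an assumption for illustration;
# real tools typically rely on a substitution matrix instead).
SIMILAR_GROUPS = ["GAVLI", "FYW", "CM", "ST", "KRH", "DENQ", "P"]
GROUP_OF = {res: idx for idx, group in enumerate(SIMILAR_GROUPS) for res in group}


def alignment_stats(seq_a, seq_b, gap="-"):
    """Compute identity, similarity and aligned fraction (all in percent)
    for two already-aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    columns = aligned = identical = similar = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no residue in either sequence
        columns += 1
        if a == gap or b == gap:
            continue  # residue facing a gap: not an aligned pair
        aligned += 1
        if a == b:
            identical += 1
            similar += 1  # identical pairs also count as similar
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            similar += 1

    if aligned == 0:
        raise ValueError("no aligned residue pairs")

    return {
        "identity_pct": 100.0 * identical / aligned,
        "similarity_pct": 100.0 * similar / aligned,
        "aligned_pct": 100.0 * aligned / columns,
    }


# Hypothetical toy peptides, not the protease sequences from the dataset.
stats = alignment_stats("MKTAYIAD", "MKSAYLAW")
print(stats)
```

Here identity and similarity are normalised by the number of aligned residue pairs, and the aligned fraction by the number of non-empty alignment columns. Other conventions, such as dividing by the length of the shorter sequence, give slightly different percentages.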
The crystal structure of the protease, solved by our partner Prof. Yang's group at ShanghaiTech, has been officially released (https://www.rcsb.org/structure/6LU7) and is now included in the dataset.