Human prion protein: exploring the thermodynamic stability and structural dynamics of its pathogenic mutants

Abstract Human familial prion diseases are known to be associated with different single-point mutants of the gene coding for prion protein with a primary focus at several locations of the globular domain. We have identified 12 different single-point pathogenic mutants of human prion protein (HuPrP) with the help of extensive perturbations/mutation technique at multiple locations of HuPrP sequence related to potentiality towards conformational disorders. Among these, some of the mutants include pathogenic variants that corroborate well with the literature reported proteins while majority include some unique single-point mutants that are either not explicitly studied early or studied for variants with different residues at the specific position. Primarily, our study sheds light on the unfolding mechanism of the above mentioned mutants in depth. Besides, we could identify some mutants under investigation that demonstrates not only unfolding of the helical structures but also extension and generation of the β-sheet structures and or simultaneously have highly exposed hydrophobic surface which is assumed to be linked with the production of aggregate/fibril structures of the prion protein. Among the identified mutants, Q212E needs special attention due to its maximum exposure of hydrophobic core towards solvent and E200Q is found to be important due to its maximum extent of β-content. We are also able to identify different respective structural conformations of the proteins according to their degree of structural unfolding and those conformations can be extracted and further studied in detail. Communicated by Ramaswamy H. Sarma


Introduction
Transmissible spongiform encephalopathies (TSE) (Prusiner, 2001;Ross & Poirier, 2004;Watts & Prusiner, 2017;Zhong, 2012), or prion diseases, are a rare group of neurological disorders of humans and animals characterized by spongiform neurodegeneration of the brain caused by prion proteins. Amyloid deposits may parallel the pathology and primarily consist of the aberrant, misfolded form of the cellular prion protein (PrP C ) named as PrP Sc (Ross & Poirier, 2004;Tycko & Wickner, 2013;Zweckstetter et al., 2017). According to the 'prion only' hypothesis, the key event of prion pathogenesis in TSE diseases involves conformational conversion of the prion protein (PrP) from its cellular isoform (PrP C ) to a disease associated scrapie isoform (PrP Sc ) (Jenkins et al., 2008;Legname, 2012;Sanz-Hern andez et al., 2021). This conversion includes a transition of a-helix structures into b-sheets. However, the underlying processes of conversion of a properly folded prion protein into b-rich scrapie isoform are not explicitly determined. Also, decrease in pH, an increment of temperature, and introduction of mutations at several conserved positions of the wild-type (WT) and recombinant prion protein are reported as promoting factors of the conversion into PrP Sc (van der Kamp & Daggett, 2009, 2010a, 2010b. The major differences that discriminate PrP Sc from the cellular form are partial protease resistance, amyloidogenicity, and the higher content of b-sheet (Lee et al., 2010;Legname, 2012). Up to now, several useful models have been proposed for PrP Sc , but the tertiary structure of PrP Sc has not been determined yet.
The human prion protein (HuPrP) is a glycoprotein consisting of 209 residues and is anchored by a C-terminal glycosylphosphatidylinositol to the outer leaflet of the plasma membrane of the nerve cell and is highly conserved among mammals (Zahn et al., 2000). Although PrP C is associated with several processes in the nervous system, its actual physiological function remains elusive (Corsaro et al., 2012;Martins et al., 2001). Several structural studies have been carried out on WT and recombinant PrP molecules and their mutants (Behmard et al. 2011(Behmard et al. , 2012Chen et al., 2014;Cheng & Daggett, 2014;D'Angelo et al., 2012;Zahn et al., 2000Zahn et al., , 2003. The presence of a copperbinding octapeptide repeat suggests a role in copper metabolism (Brown & Sassoon, 2002;Gasperini et al. 2016), as cultured PrP C deficient cells have been shown to possess a reduced copper content, lower superoxide dismutase activity and increased susceptibility to oxidative stress (Soprana et al., 2011;Wong et al. 2001). Moreover, PrP C -null mice show enhanced protein and lipid oxidation. Active uptake of PrP C by clathrin-coated pits and caveolae-like vesicles indicates a role in copper homeostasis using a copper sink or as a carrier for other copper-requiring cytosolic proteins (Sakudo et al., 2004). There is a significant abundance of PrP C in the hippocampus, a region assumed to be involved in memory function (Halliday & Mallucci, 2015;Si & Kandel, 2016), suggesting a role in memory formation for PrP C . Another suggested role derived from neuronal studies on cells from PrP-null mice is neuroprotection. Prion proteins (PrP C ) from different species are extensively studied both experimentally and theoretically and their structures are determined using either NMR spectroscopy or X-ray crystallography or both (Antonyuk et al. 2009;Calzolai et al., 2005;Knaus et al. 2001;Lysek et al. 2005;Zahn et al., 2000). The fact is pathogenic mutations of such proteins may lead to misfolded conformations of the cellular protein, which eventually turn into aggregates and ultimately become a part of a well ordered fibrillar structure (Cardone et al. 2014;Rossetti et al. 2010Rossetti et al. , 2011. Therefore, it is important to study the diseasecausing pathogenic forms of prion protein with the help of molecular dynamics (MD) simulations. Effect of such mutations on the structural stability and energetics for different proteins and complexes have been studied for long by using various methods like all-atom explicit MD simulations or coarse grained simulations or Replica exchange or SMD simulations. (Choi et al. 2017;De Mori et al. 2004;Ju arez-Jim enez et al. 2020;Montefiori et al. 2019;Morra & Colombo, 2008;Mykuliak et al. 2020;Paladino et al. 2020;Serapian & Bo, 2016).
In this study, we perform a thorough protein design and sequence-based analysis followed by MD simulation of the HuPrP sequence (PDB ID:1QLX) (Zahn et al., 2000) to generate series of single-point pathogenic mutants of HuPrP. Some other group used simple, molecular dynamics-based, energy decomposition approach to map the principal energetic interactions in set of proteins representative of different folds. In our study, we iteratively perturb the selected protein at multiple locations in term of mutations to explore its likelihood towards conformational disorders (Behmard et al., 2011(Behmard et al., , 2012Cardone et al. 2014;Chen et al. 2014;Cheng & Daggett, 2014;Dutta et al. 2013;Rossetti et al. 2010;Zhong, 2012). As a result, we generate 12 pathogenic variants of which E196K, V210I and E211Q corroborate with the literature (Behmard et al. 2012;Cheng & Daggett, 2014;, and rest are unique mutants and are studied by us. We perform extensive all-atom MD simulations of HuPrP (PDB ID: 1QLX) (Zahn et al., 2000) with an explicit solvent model at room temperature (300 K) under constant pressure and temperature condition. The potential energy of the system matches well with the literature (Mansouri et al. 2013). Next, we perform MD simulations both at room temperature as well as at higher temperatures with similar simulation conditions on the native protein and the selected mutant structures for 100 ns each. In addition, we carry out SMD simulation of the native and the mutant proteins as an alternative approach to study protein unfolding dynamics. Elevated temperature facilitates efficient monitoring of the unfolding dynamics of the protein and especially its mutants in a short period (El-Bastawissy et al. 2001;Shamsir & Dalby, 2005). MD simulation of prion protein at normal (300 K) and elevated (500 K) temperature performed by other researchers showed that secondary structure of the protein remains reasonably stable at both the temperature (El-Bastawissy et al. 2001). Our analysis indicates that the native protein shows superior flexibility or structural dynamics (more specifically in the loop regions) though the major secondary structures get preserved, which corroborates well with other existing report .
On the contrary, few mutants show significant structural perturbations compared to that of the native protein even at room temperature. With the increase in simulation temperature, structural perturbations of different secondary structures of the mutants increase but with varying extents. Following the hightemperature simulations of different mutants, it becomes evident though the site of mutations that vary from each other but these share similar unfolding patterns as found in some of the previous studies with some prion protein mutants (Cheng & Daggett, 2014). SMD study of the native as well as mutant proteins also confirm similar findings mentioned above. Here, we can monitor the early unfolding steps of the native as well as the 12 different mutants of HuPrP in depth. We also aim at locating different important mutation sites of HuPrP that are related to structural instability and the misfolding phenomenon of the native structure. Thus, we can evaluate the role of specific amino acid residues at specific positions of the protein and follow the structure-function relationship of different pathogenic mutants of HuPrP. The primary goal of our study is to identify single point mutations of HuPrP that are capable of distorting the native structure of the protein to appreciable extents. This not only involves the unfolding of the helical structures but also includes extension and generation of the beta-sheet structures and increased exposure of the hydrophobic surface as well in some cases, which is assumed to be the precursor for generation of aggregate structures. Among the identified mutants some have been studied early (E196K, V210I and E211Q) (Behmard et al. 2012;Cheng & Daggett, 2014) and few are not explicitly studied before (Y145W, Q160R, H187Y, E196Q, E200Q, R208K/Q, Q212E and E219Q). The mutant Q212E needs special attention due to its maximum exposure of the hydrophobic core towards solvent whereas E200Q contains maximum extent of b-structures during unfolding. Besides, our study helps to identify different protein conformations (native-like, intermediate like and unfolded like) of the Wt protein and its mutants and these structures can be extracted and further studied in details.

Materials and methods
As revealed by its atomic structure, the mature PrP C expressed by different mammals are structurally very similar Lysek et al. 2005) though variations in local sequence and structure are mostly observed in the region of the b 2 -a 2 loop and the C-terminal part of the a 3 helix within PrP C of different species. The NMR structure of the mature recombinant HuPrP (Zahn et al., 2000) consists of a flexible N-terminal tail (residues 23-124) and a well-structured C-terminal region (residues 125-231) as shown in Figure 1. The globular core of wild type (WT) HuPrP contains three a-helices (a 1 : residues 144-154; a 2 : residues173-194; a3: residues 200-228), two short antiparallel b-sheets (b 1 : residues 129-131; b 2 : residues 161-164) (Figure 1), and an intramolecular disulfide bridge between Cys-179 and Cys-214 residues linking helices a 2 and a 3 . The N-terminal portion of HuPrP is highly flexible (Hara & Sakaguchi, 2020) and largely disordered compared to the C-terminal core that contains stable secondary structure elements.

Working methodology
The flowchart of the entire methodology has been shown in Figure 2. We have employed EvoDesign , an evolution profile-based de novo protein design method. The method offers several options for users to select different guiding force fields, structural thresholds for profile construction and residue conservations. The execution time of the server is fast and scales in hours because of the quick convergence of the simulation search under the profile restraints. EvoDesign is established as an automated, and yet reliable, on-line facility most useful for protein engineering and drug discovery studies as verified by several experimental studies like Rajesh et  EvoDesign is an ab initio protein design tool that takes one protein structure as an input and output a number of novel alternative protein sequences which is supposed to fold to the input structure. EvoDesign consists of three phases-pre-processing, simulation, and clustering. In the pre-processing stage it gathers the protein structures homologous to the input structure and then computes one position specific scoring matrix (PSSM) after aligning the sequences of the homologous protein structures. During the simulation phase, with the help of PSSM and energy function generates a number of decoy sequences which are then clustered at the final stage to output a number of possible design sequences.
We have used the protein design algorithm as developed by , where a Metropolis Monte Carlo (MC) simulation is the core of the algorithm. At each step of the MC simulation, a set of randomly selected residues are mutated randomly. Next their fitness for the acceptance/ rejection is decided by a hybrid (evolutionary-based and physics-based) score function along with the Metropolis criterion. Thus, for our case the mutation sites are selected randomly. By employing this software, we can predict whether a given protein (here Wt HuPrP [PDB ID:1QLX] (Zahn et al., 2000)) mutated at some specific location will lead to protein conformational disorders or not. We have also carried out pairwise sequence alignment of the target protein sequence with other closely related sequences of literature-existed proteins (obtained from BLASTP) (Liu et al. 2011). Then we compare the results obtained from two different methods mentioned above for the same protein sequence and selected the sequences that are common for both the methods at specific residue positions. Using I-TASSER (Wu et al. 2007), we model these finally selected single point mutated sequences.
I-TASSER is widely used protein folding web service that takes one protein sequence as an input and outputs atomic model of that sequences. For our design sequence, we used I-TASSER to model the protein structure. After successful modeling I-TASSER outputs C-score corresponding to each model. C-score is a measure for fold level accuracy that varies from À5 to 2 with higher the value better is the model. Table 1 includes RMSD and TM-score values of model structures of the mutants with respect to the native protein structure computed with TMalign software (Zhang & Skolnick, 2005). High TM-score values of the mutants represent that these follow similar fold pattern with the native protein.
Next, we perform MD simulations of the native and the single point mutants for 100 ns each to determine their detailed dynamical behaviour. We use GROMACS 5.1.4 (Hess et al. 2008) with the OPLS-AA force field for simulating protein molecule (Kaminski et al. 2001;Price & Brooks, 2002;Shirts et al., 2003). We consider HuPrP (PDB ID:1QLX) and its single point pathogenic mutants Y145W, Q160R, H187Y, E196K/Q, E200Q, R208K/Q, V210I, E211Q, Q212Eand E219Q (similar to protective polymorphic mutant E219K) for this study. Among these, some of the similar mutations are studied earlier by different groups like Y145stop, Q160stop,  Schmitz et al. 2017;Zhong, 2010). Rest are relatively less studied or never studied before.
For the WT HuPrP, we use the NMR structure of HuPrP at pH 4.5 (PDB ID: 1QLX, residues 23-230, shown in Figure 1) that carries the methionine polymorphism at codon 129 (M129). The structures of the single point disease-related HuPrP mutants have been generated inserting the required mutation at the respective position of the 1QLX sequence. We model the mutated sequences using I-TASSER since the high-resolution structures do not exist for these to date due to their inherent structural instability. Existing NMR structures of few mutants have also been considered (Biljan et al. 2011(Biljan et al. , 2012Ilc et al. 2010) for MD simulations and the results are compared with the simulation results of the model structures. We maintain the same methionine polymorphism at codon 129 (M129) of the mutants as well.  All the systems are simulated in explicit aqueous solution, inserted into a cubic box (box vector 1.0 nm) of water molecules. The number of water molecules into the cube varies within a range of 9,000 to 10,000 or even more for different systems. The OPLSAA force field (Jorgensen & Tirado-Rives, 1988;Kaminski et al. 2001;Price & Brooks, 2002;Shirts et al., 2003) is used for the protein, in combination with the AMBER-adapted Aqvist potential for the counter ions and the TIP3P force field for the water. Neutral pH conditions are realized by setting the protonation states of the ionisable residues according to their pKa values. Specifically, the counter ions used here are the Na þ ions. Simulations are performed with periodic boundary conditions in the NVT followed by NPT ensemble for the low-temperature simulations, with temperatures and pressure kept close to the desired value (T ¼ 300K, p ¼ 1 bar) through the Berendsen coupling schemes, respectively. Long-range electrostatic interactions are treated with the particle mesh Ewald (PME) method, using a grid with a spacing of 0.16 nm, combined with a fourth-order cubic spline interpolation to compute the potentials and forces in between grid points. The cut-off radius for the Lennard-Jones interactions, as well as for the real part of PME calculations, is set to 1.0 nm. The LINCS algorithm is used to constrain all bond lengths involving hydrogen atoms and the time step used is 2fs. The systems are energy-minimized through the steepest descent method imposing harmonic position restraints of 1000 kJ mol À1 nm À1 on solute atoms, allowing the equilibration of the solvent without distorting the solute structure. After an energy minimization of the solvent and the solute without harmonic restraints, the temperature is increased from 0 to 300 K. Multiple simulations are performed for each system with the same model structure. However, we assign statistically independent initial velocities harvested from a Maxwell distribution at the appropriate temperature (Bussi et al., 2007). For higher temperature simulations (400 K, 500 K), the temperature of the systems has been raised in a stepwise fashion at a temperature interval of 25 K and 50 K, respectively, starting from the pre-equilibrated trajectories of the same system at a lower temperature (300 K). We followed the simulation of the protein at different temperatures to explore the variation of protein dynamics with respect to temperature and compare the results of different temperature simultaneously. At each temperature level, the system has been well equilibrated before further increment of temperature at a higher level. All simulations are performed and analysed with the GROMACS software package 5.1.4 (Hess et al. 2008). Several Gromacs utilities have been utilized for the analysis of the simulated trajectories (detailed information are included in the supporting information data).

SMD simulation
This method has been used as an alternative approach for studying unfolding dynamics of WT HuPrP and its different mutants by using NAMD software (Phillips et al. 2005). Like MD simulation, all the systems have been fully immersed in a cubical water box. Followed by minimization and equilibration under NPT condition, mechanical force has been applied to the C a -atom of the last residue of the C-terminal end keeping the other end (N-terminal) fixed throughout the simulation. C a atom of the C-term end of each protein is being pulled by moving a spring (with force constant 7 kcal mol À1 Å À2 ) with a constant velocity in a range of 0.0025 Å/timestep.

Overall structural properties from MD simulations
We present the 100 ns MD simulation results of the WT HuPrP (PDB ID:1QLX) and a series of its single point pathogenic mutants Y145W, Q160R, H187Y, E196K/Q, E200Q, R208K/Q, V210I, E211Q, Q212E and E219Q that is likely to be similar with protective polymorphic mutant E219K. The statistical accuracy of our results is investigated by performing multiple independent simulations for each system. The results are found to be comparable for each set of runs.
Here, we primarily present the results of one of the multiple simulations. Nevertheless, all the details are furnished in Supporting Information, in case, we note any variations among the multiple simulations.
3.1.1. Overall stability and conformational flexibility C a -trace root-mean-square deviation (C a -RMSD) is an indicator of stability and conformational flexibility of proteins. Since the core structure exhibits similar dynamics to that of the overall protein structure, so we exclude few terminal residues and focus from residue 125 to 228. Figure 3(a,b) show the C a -RMSD of WT HuPrP and the mutants at 400 K and 500 K respectively with respect to their respective initial structures. Native protein and some of the mutants studied show significantly higher C a -RMSD values e.g. Y145W, Q160R, E196Q/K, E200Q, Q212E at 500 K compared to that of the wild type protein at 300 K and 400 K, whereas other mutants show relatively lower C a -RMSD values at 500 K. C a -RMSD values of WT HuPrP and the mutant proteins at different temperatures are represented individually in Figure S1(a,b) of the supporting information. This clearly shows that the studied mutants exhibit more or less significant structural deviations with respect to that of the native protein at low temperature.
As illustrated in Figure 4(a-f), WT HuPrP remains rather stable during the native state simulation at lower temperature. While the temperature gets increased, the protein unravels rapidly (Figure 4). Figure 4(g-i) show significant structural dynamics of the helices and b-sheets and the associated regions of the native protein compared to that observed at lower temperature. 3.1.2. Residue-wise conformational flexibility C a -trace root-mean-square fluctuation (C a -RMSF) represents the residue-wise structural flexibility of the proteins for all simulations. Figure 3(c,d) shows the RMSF of C a atoms of WT HuPrP and the mutant proteins at 400 K and 500 K as a function of residue number. Each simulation of WT HuPrP at 300 K exhibits a groove pattern that supports the observations by . In addition to the expected flexibility of N-and C-termini of the globular domain of WT HuPrP, structural fluctuations prominently found at the loop before a 2 (165-172) and a 3 (193)(194)(195)(196)(197)(198)(199) and relatively less before a 1 (137-150) even at 300 K. The lowest fluctuation is observed in the region of the disulfide bridge (specific portions of a 2 and a 3 ). At 400 K, the C a -RMSF shows an overall increase in fluctuations especially in the regions mentioned above [Figure 3(c)]. The C a -RMSF of WT HuPrP and the mutants at 500 K indicates even more structural perturbations or decreased thermal stability compared to the mutants of WT HuPrP at 400 K [ Figure 3 3.1.3. The radius of gyration (R g ), overall number of hydrogen bonds and overall solvent accessible surface area (SASA) The distributions of the radius of gyration (R g ) of WT HuPrP and the mutants at 400K and 500K are shown in Figure  S2(a,b) of the supporting information, respectively as a function of time. Figure S2(a) highlights quite steady R g distribution with minor fluctuations for few e.g. for E196Q, V210I mutants at 400K. This indicates that the global structures of the native and mutant proteins are more or less conserved light violet þ symbol represents Wt HuPrP at 400 K, green Â symbol represents E219Q, cyan Ã symbol represents Q212E, brown Ã Á symbol represents E211Q, yellow filled Ã Á symbol represents V210I, blue o represents R208K, red filled o represents R208Q, black D represents E200Q, Deep violet filled D represents E196K, green r represents E196Q, cyan filled r represents H187Y, brown^represents Q160R and yellow filled^represents Y145W and in (b and d) light violet þ symbol represents Wt HuPrP at 400 K, green Â symbol represents Wt HuPrP at 500 K, cyan Ã symbol represents E219Q, brown Ã Á symbol represents Q212E, yellow filled Ã Á symbol represents E211Q, blue o represents V210I, red filled o represents R208K, black D represents R208Q, Deep violet filled D represents E200Q, green r represents E196K, cyan filled r represents E196Q, brown^represents H187Y, yellow filled^represents Q160R and light blue o represents Y145W; Time evolution of C a -RMSF of WT HuPrP and the mutants as a function of residue number at (c) 400 K and (d) 500 K. at this temperature. On the contrary, the distribution becomes more extended/flat for the same at 500K. The values of R g increase from the range of 1.2 nm to 1.9 nm and the mutants like Q212E, Q160R, E196Q, E200Q, V210I and E211Q exhibit significantly higher R g values compared to other mutants at 500 K and native protein at a lower temperature. The downward trend of R g confirms the compactness of the structures of the related proteins with respect to the others. Figure S2(c,d) of the supporting information present time variation of overall number of hydrogen bonds of WT HuPrP and its mutants at 400K and 500K, respectively. From these figures, this is evident that the decrease of overall number of hydrogen bonds of the mutants is more prominent at 500K than that at 400K as compared to the native protein. Figure 8(a,b) present time variation of overall solvent accessible surface area (SASA) of WT HuPrP and its mutants at 400K and 500K, respectively. The overall SASA of the mutants increase in the range of 65 to 95 nm 2 at 500K whereas the increase of SASA remain in the relatively stable range at 400 K.

Changes of secondary structure
Plots of changes in the secondary structure of WT HuPrP and few representative mutants E200Q, R208Q, E211Q and Q212E are shown in Figure 5 as determined by gmx do_dssp from MD simulations. Evolutions of secondary structures for other mutants during the courses of the simulations are included in Figure S3 of the supporting information. A thorough investigation of these plots of WT HuPrP and the mutant structures shows a distinctively different pattern for each individual. WT HuPrP possesses a stable secondary structure at 300 K and even at 400 K [ Figure 5(a)]. In contrast, the native protein shows significant structural perturbations at 500 K [ Figure 5(b)]. All the helices become disrupted more or less compared to that of WT HuPrP at low temperature with a 1 helix, the least, and a 3 helix mostly disrupted structures. Native b-sheet structures get extended and new b-sheets form especially at the loop between a 2 and a 3 helices and the end of a 3 helix [ Figure 5(b)]. Figure 5(c) represents changes in the secondary structure of E200Q with time at 500 K. The changes are similar to the changes observed for WT HuPrP at 500 K but with a relatively greater extent for a-helices and reduced extents for b-sheets. R208Q show similar structural dynamics for helices as observed for E200Q and new b-structures appear in the loops between a 1 and b 2 and end of a 3 helix [ Figure 5(d)]. E211Q and Q212E exhibit comparable structural dynamics for the helices but little different for the sheets at 500 K [ Figure  5(e,f)]. As shown in [ Figure 5(b-f)], portions of several helices and the associated loops convert into b-sheets. Thus we observe significant structural variations of the helices and b sheets with different extents for the studied mutants during the simulation. Structural transitions of helices into sheets is known to be associated with the process of conversion of prion protein from its cellular form (PrP C ) to disease-associated form (PrP Sc ) and thus related with the misfolding diseases. Along with the extension, the formation of new b-sheet structures are also observed in proteins like Q160R, H187Y, E200Q, R208K/Q, V210I, Q212E and E219Q that appear mostly in the regions associated with a 2 or a 3 helix (Figures 5 and S3 of the supporting data). But, few shows formation of new b-sheet structures nearby the region of a 1 helix compared to the others like E200Q, R208Q and Q212E. The variant E219Q (similar to the protective polymorphic mutant) also shows extensive dynamical fluctuations of its secondary structural units as others.

Principal component analysis (PCA)
PCA analysis of protein structures shows about 40% of the total mean-square displacement of C a atom positional fluctuations is captured in the first few principal components [ Figure 6(b,c)]. It also allows testing of different classes of structures in terms of differential loading of properties in the first few principal components. In the present work, the properties are normalized for each set of simulations as described by Kazmirski and others (Kazmirski et al. 1999). Plotting HuPrP structures onto PC1 and PC2 characterizes conformational relationships between different HuPrP structures. By plotting the differential loadings along PC1 and PC2 for different structures of the native and some of the representative mutant proteins like E196Q, E211Q and Q212E at 500 K [ Figure 6(e-h)], we observe data points form 2 to 3 major clusters. WT HuPrP and the mutants like E211Q, Q212E show clustering of data into relatively lower value regions [ Figure  6(e,g,h)] whereas proteins such as E196Q shows similar clustering of data into higher value region [ Figure 6(f)]. WT HuPrP and the mutants e.g. E211Q, E196Q show 2 major clusters whereas the other mutant show 3 or more broad major clusters of protein conformations.
Thus PCA analysis gives rise to three or two different sets of structurally unrelated conformations of the protein. Some other significant results of the PC1 vs PC2 plots of mutants are included in the supporting information ( Figure S4). This clearly indicates protein structures for some of the mutants (E196Q, E200Q, E211Q) belong to at least 3 or more major clusters at 400 K. Most importantly, the structures of the same mutants belong to 2 major clusters at 500 K. This shows that with the increment of temperature, structures can be classified into fewer number of clusters. In other words, structures obtained from the high temperature simulations represent extensively unfolded proteins compared to the structures obtained from the low temperature simulation and in some cases, the intermediate structures observed at low temperature are not at all visible at high temperature.

Conformational clustering
Extensive all-atom simulations of the mutants at different temperatures generate a variety of structures that deviate significantly from the starting NMR structure of the native protein. Conformational clustering is used to locate ensembles of similar structures with the initial structure of the MD trajectory in each case and distributions of the conformations are obtained as a function of the respective C a RMSD values. Figure 7(a,b) shows C a RMSD distribution curves of WT HuPrP and its mutants at 400 K and 500 K respectively. C a RMSD values of the proteins at 400 K peak at the lower end i.e. mainly around 0.4 nm and some in 0.7 nm range [ Figure  7(a)]. On the other hand, C a RMSD values of the proteins at 500 K peak around lower to the higher end of C a RMSD values such as 0.35 nm, 0.7 nm, and more than 1.0 nm and more than 1.3 nm (for few). This is also noteworthy, that the C a RMSD distributions of mutant protein structures at 400 K resemble the distribution of C a RMSD values of Wt HuPrP at the same temperature. On the contrary, along with the native-like distribution of C a RMSD values ($0.3 nm), the mutants show other peaks around relatively higher C a RMSD values (0.6-0.7 nm) [ Figure  7(b)]. Most interestingly, mutants at 500 K show peak at (0.9-1.1 nm) along with the other two peaks as described for the mutant proteins at 400 K. It might be concluded from this study that the proteins show one stable set of conformations that resemble the stable conformations of Wt HuPrP at lower temperature and another set of conformations that represent relatively unfolded set of protein conformations at 400 K. At higher temperature C a RMSD value distribution represents a significantly unfolded set of conformations along with the other two major distributions of conformations as observed for proteins at low temperature. By using conformational clustering technique with C a RMSD as the clustering parameter of prion protein and mutants, we are able to identify 3 distinctively different sets of protein conformations all together, one that resembles the native-like state, other that Figure 8. Time variation of overall SASA of WT HuPrP and the mutants at (a) 400 K and (b) 500 K and time evolution of SASA of hydrophobic core region of the same at (c) 400 K and (d) 500 K.Time variation of SASA of individual residues in the hydrophobic core of WT HuPrP and the mutants at 400 K (e) and 500 K (f). Same color scheme is used for the proteins as in Figure 3. mimics conformations of protein unfolding intermediates and the last one spans the conformations more likely to be of extensively unfolded proteins. Thus the different conformations of proteins can be extracted and further studied which is beyond the scope of the present work.

Specific structural properties from MD simulations (in terms of different parameters)
3.3.1. C a -root-mean-square deviation (C a -RMSD) of major structural units Figure S5 of the supporting information depicts the C a -RMSDs of the helices (a 1 , a 2, and a 3 ) and the sheets (b 1 and b 2 ) that are obtained at 500 K. This figure also justifies that C a -RMSD of all the helices increases with time, and the maximum structural perturbations are observed for a 2 and a 3 helices . In contrast, C a RMSD of b-sheets remains stable with minor fluctuations. Thus, a-helices are more prone towards unfolding compared to that of b-sheets in prion protein and its mutants as reported early for some other mutants (Ji et al. 2005).

SASA of the hydrophobic core region
Despite the large structural deviations observed in the hightemperature range, there is a notable region in the protein that retained its original structure throughout the simulation. This region is comprised of a segment in a 2 (residues 175-185), and is in close contact with a segment in a 3 (residues 210-220). The inter-residue contacts between these two structural segments are not only dense but also stable (Figure 3), as reported from the previous study ). The RMS fluctuations in Figure 3(c,d) also indicate the higher stability of this region. The conformation of this region is stabilized by the disulfide bridge (between Cys179 of a 2 and Cys214 of a 3 ) and by the hydrophobic interactions among several hydrophobic residues (134, 137, 139, 141, 158, 161, 175, 176, 179, 180, 184, 198, 203, 205, 206, 209, 210 and 213-215). The first few residues of the hydrophobic core of the protein span a 1 and associated portions (134, 137, 139, 141, 158 and 161), whereas the others lie on the hydrophobic regions of a 2 and a 3 . This inter-residue hydrophobic interactions help a 1 to get associated with other distant helices and supports to maintain the stability of the globular domain of prion protein.
The time evolution of overall SASA and the SASA of the hydrophobic core of WT HuPrP and the mutants at 400 K and 500 K are shown in Figure 8(a-d). The increase of SASA values of the hydrophobic core is more prominent than the increase of SASA of the overall protein even at 400 K with respect to the native protein at 300 K. The mutants like E196K/Q, Q160R, Q212E, V210I and R208Q show significant exposure of the hydrophobic surface compared to the wild type protein at 400 K. Figure 8(e,f) represents a variation of SASA of each residue of the hydrophobic core with time at 400 K and 500 K, respectively. Among these, some show a significant increase of SASA compared to that of the native protein, specifically at 500 K. Large change in hydrophobic SASA values or SASA values for some specific residues of the hydrophobic core confirm exposure of the hydrophobic surfaces for some of the mutants prone towards unfolding (Cheng & Daggett, 2014;van der Kamp & Daggett 2010b) during simulations. The differences in hydrophobic surfaces mainly lie in the residues 134,141,158,175,184,198,203,205,209,213 and 215 [Figure 8(f)] compared to the native protein. As we know that residues 134, 141; 158; 175 and 198 lie in the loop regions connecting b 1 -a 1, a 1-b 2, b 2 -a 2 and a 2 -a 3 , respectively, and residue 184 on a 2 and residues 203, 205, 209, 213 and 215 belong to a 3 helices. Thus our study reaffirms the flexibility of structures in these loop regions and part of a 2 and a 3 helices of prion protein and that become exposed during simulations.
3.3.3. Changes in the secondary structure of major structural units Figure 9 shows a percentage of residues in a particular secondary structure of the mutant as a function of simulation time compared to that of the native protein. The unwinding of helices is more prominent for the mutants with different extents at 500 K rather than at 400 K with respect to the native protein at a lower temperature [ Figure 9(a,b)]. Figure  9(a) shows that the helix percentage gets significantly reduced for the mutants like E196K/Q and E200Q after few nano seconds of their respective simulations. Figure 9(b) represents a huge reduction of the percentage of helices (even to 0) for Q160R, E196Q, E200Q, R208K/Q and E219Q after few ns of their respective simulations at 500 K [ Figure 9(b)]. Interestingly, there is an overall increase in the percentage of b-sheet for some of the variants like Q160R, E196Q, H187Y, V210I and E219Q even at 400 K [ Figure 9(c)]. As shown in Figure 9(d), during the simulations of the native and the mutants at 500 K, the b-sheet portion becomes almost double as observed for Q160R, H187Y, R208K/Q, E200Q and E219Q. Percentage of a 1, a 2, and a 3 helices individually decrease over time for different mutants at 500 K (data not shown). This is evident from the figures that not only the overall helical percentage reduces with time but each helical percentage decrease as well for the mutants compared to that of the native protein.

Monitoring specific hydrogen bonding interactions
The presence of an antiparallel b-sheet plays a crucial role regarding the conversion of native prion protein (PrP C ) into scrapie isoform (PrP Sc ). Both extension and reduction of b-sheet with respect to the native protein are reported as a sensitive event in this regard.  Figure S6(a-f) of supporting information show the variation of hydrogen bond distances within the above-mentioned donor-acceptor pair of WT HuPrP and series of its mutants at 400 K and 500 K. The hb 1 distance remains mostly is in the stable range for native and most of the mutant proteins except E219Q, Q212E at 400 K [ Figure S6(a)] and E219Q, E196K/Q with much greater extent at 500 K [ Figure S6(b)]. Figures S6(c,d) represent the evolution of hb 3 distances with time for the native and the mutant proteins at 400 K and 500 K, respectively. The hb 3 distances increase for E219Q, R208K at 400 K and E219Q, E196K/Q with a significant extent at 500 K [ Figure S6(c,d)]. Similarly, hb 4 distances vary slightly for R208K/Q, E200Q at 400 K whereas that vary appreciably for E219Q, E196K/Q, Q212E, E200Q and Y145W at 500 K compared to that of the native protein [ Figure S6(e,f)]. In general, the relative variation of inter b-strands distances of some of the mutants like E219Q, E196K/Q, R208K/Q, E200Q and Q212E compared to WT HuPrP indicates that the compactness of these mutant structures gets disrupted by the weakening of the stable hydrogen bonding interactions within b-strands.

Monitoring specific salt bridge (SB) interactions
This is noteworthy that even partial distortion of the original SB interactions present in the native protein is sufficient to destabilize the stable folded conformation of the protein (PrP C ) and thus helps to accelerate the transition into the disease-related isoform (PrP Sc ). Polar mutations mainly affect the native SB networks. Here we consider several important SBs located in a 1, a 3 helices or in b 2 -a 2, a 2 -a 3 loop regions.
The study includes two necessary SBs (SB2: between R156-E196 and SB3: between E146-K204), which helps to maintain the tertiary structure of the globular domain of the protein.
The SB2 is situated in the loop before b 2 and a 3 and SB3 resides on a 1 and a 3 helices. Apart from those SB1 (E146-R208) is located on a 1 and a 3 helices and SB4 (D202-R156) is located on a 3 helix and the loop before b 2 strand. Figure S7(a-h) represents the evolution of SB distances mentioned above as a function of time for WT HuPrP and the mutants at 400 K and 500 K. The SB 1 distances remain almost in the stable range for native and most of the mutant proteins except H187Y, V210I, E211Q and E196Q at 400 K [ Figure S7(a)] and with a relatively greater extent for E196Q, Q160R, R208Q and Q212E at 500 K [ Figure S7(b)]. The SB 2 distances increase for few of the mutant proteins like E196Q, V210I, R208Q and E200Q at 400 K [ Figure S7(c)] and for native protein as well as for mutants like E196Q, R208Q, Q212E and Y145W at 500 K [ Figure S7(d)].
Similarly, Figure S7(e,f) represents a time evolution of SB 3 distances for native as well as mutant proteins at 400 K and 500 K. The SB 3 distances remain stable except for few e.g. E196K, E211Q and R208K at 400 K and vary for a range of mutants e.g. E196Q, Q160R,V210I, Q212E and E200Q at 500 K. The SB 4 distances increase significantly for H187Y, E196K/Q and E200Q at 400 K [ Figure S7(g)] and mutants like E196Q, Y145W, E219Q, Q160R, Q212E and E200Q at 500 K [ Figure  S7(h)]. Our results suggest that SB interactions between a 1 and a 3 helices or loops before the b 2 strand and a 3 helix become weakened for the mutants like E196K/Q, E200Q, E219Q, R208K/Q, V210I, Q212E and Q160R. As evident by the dynamical behaviour of these mutants from their respective MD trajectories that a 1 helix starts to detach from the rest of the protein core after the first few ns of simulations for most of the mutants. The a 1 helix gets separated from the inner core region of the proteins in the due courses of simulations. As a result, the SB interactions between a 1 and a 3 helices or loops before the b 2 strand and a 3 helix decrease, which is consistent with the existing studies.

Results from SMD simulations
SMD simulations are used as an alternative approach to study the unfolding dynamics of the native and mutant proteins. Figure 10(a) represents current forces applied on the respective SMD atoms of the proteins vs time. This figure shows a sharp rise in the plot around 200 ps which indicates complete unfolding of the protein during that time range. Figure 10(b) shows the same plot for a lower time scale range. This is evident from this figure that the fist peak appears in the time range around 50 ps which indicates breaking of hydrogen bonds between the b-sheets. Figure  11 also confirms the same fact that shows different snapshots (0 ps, 50 ps and 100ps) of protein conformations of Wt HuPrP and two representative mutants E196Q and E211Q. The hydrogen bonds between the b-strands break during 50 ps and the strands get separated and a 1 helix detaches from the rest of the core. The unfolding dynamics observed by SMD simulations are similar with the dynamics observed with high temperature MD simulations. 2LFT (Biljan et al. 2012)). For a comparative analysis, we also perform similar kinds of all atoms MD simulations of those mutants with explicit solvent. For the analysis, we include the NMR structure of the mutant (Q212P) that shows distinctive secondary structural fluctuations in the a 3 helix region where it splits into two portions after residue 212 unlike that of the native protein. Alternatively, NMR structures of other mutants show minor structural perturbations that include changes in surface electrostatic potentials and solvent exposure of the hydrophobic core compared to that of the native protein.
We carry out extensive analyses of the MD trajectories generated from the NMR structures of the mutant like the native protein structure and model structures of the mutant proteins. We observe that E200K and E219K show stable structural dynamics and thus their structural properties remain almost close to that of the native protein. On the contrary, V210I and Q212P show relatively large structural flexibility and the related properties like C a -RMSD, solvent exposure of hydrophobic core, changes in secondary structures and R g compared to that of the respective values of WT HuPrP. Figure S10(a,b) of the supporting information demonstrates the variations in the dynamical properties of the mutants extracted from the MD simulations of NMR structures. The results obtained from the MD simulations of the NMR structures show quite a similarity in properties observed with the model structures of the mutants except for E219K. But MD simulation of E219Q mutant shows a relatively greater degree of structural flexibility. Thus this study indicates that the model structures well represent the mutant structures of the protein in general.

Comparison with previous study
As mentioned earlier in the introduction section, MD simulation results of Wt HuPrP have been compared with literature  and the potential energy value matches well with the literature. Experimental data or result does not exist for all the mutants that we studied here. Importantly, we introduced some unique mutants that are not studied before. NMR structure of few mutants under study exist in literature. We have also performed simulation of the NMR structures of those mutants along with their modeled structures for a comparative study and the results match well for both type of structures of the mutants. MD studies of few mutants like E196K, V210I and E211Q by others corroborate well with our simulation results. The researchers showed that the above mentioned mutants show maximum structural deviations for the regions e.g. b 2 -a 2 loop and end of a 2 and a 3 helices and simultaneous increase of the interior surface exposure of the associated regions that also supports our findings. Similar mutants (containing same mutational position but different side chain residue) e.g. H187R, E200K, R208H and Q212P studied by other research groups also exhibit similar structural dynamics as mentioned above.

Conclusion
This is well established that single point pathogenic mutations of prion protein help to influence the susceptibility towards developing prion diseases in humans and other species. A thorough structural comparison between the native and mutant proteins is crucial for understanding how a specific mutation can facilitate conversion of the cellular form (PrP C ) of the prion protein into the disease-related isoform (PrP Sc ). We followed the detailed structural dynamics of native as well as 12 different single point pathogenic mutants of HuPrP using different methodologies. The most important part of our study includes the identification of initial unfolding steps of prion protein as a result of a single point mutation at more or less conserved regions of its globular domain. Thus we can generalize the unfolding mechanism for a bunch of single-point mutants of HuPrP, among these majority include unique mutations that can be compared with respect to the native protein and each individual. This is evident from the unfolding simulations of the mutants studied here, involve primarily a movement of a 1 helix from the rest of the core that is formed by hydrophobic interactions within the residues located on a 2 and a 3 helices. This is because the hydrophilic residues on a 1 helix makes several important SB interactions with the residues on a 3 helix or the associated loop regions that not only gives a compact structure of the globular domain but also provides the stability of the overall structure. Our study reveals single point mutants whose mutating residues belong to the above-mentioned regions such as H187Y, E196K/Q, E200Q, R208K/Q, V210I, E211Q and Q212E that not only significantly affect the stability of the native protein but also enhance the exposure of the inner hydrophobic core for the same. Among these, the surface exposure is most prominent for Q212E.
We studied a wide range of mutants mainly situated on the C-terminal portion of HuPrP and found that initially hydrogen bonding interactions break between the b-sheets and the strands get separated and a 1 helix shows huge structural fluctuations in the course of the simulation and detaches from other two helices constituting the protein core. On the contrary, b-structures remain almost stable with few exceptions where the mutants show significant extension as well as the formation of new b-structures like Q160R, H187Y, E200Q and R208K/Q and relatively lower in E211Q, V210I and Q212E at 500 K. Thus we have identified mutants that demonstrate extension or generation of b-structures and or exposure of inner hydrophobic core simultaneously, so have more potential to form aggregated structures are Q160R, H187Y, E200Q, R208K/Q V210I and Q212E. Among these, the mutant Q212E draws special attention due to the maximum surface exposure of the hydrophobic core and thus the mutant is supposed to possess more potential to unfold and form aggregates at ease. On the contrary, the mutant E200Q possesses maximum extent of b-structure in the course of the simulation. So both the mutants mentioned above have higher potential of aggregation due to either higher b-content or higher surface exposure of the core. Thus from this study, we are able to identify some very important mutants of the human prion protein that are more aggregation-prone and thus related to the misfolding diseases and are unique in certain respects.
Here we observed a wide variety of dynamical fluctuations of different mutants with different patterns and compared the results directly with respect to the corresponding fluctuations of the native protein. This also entailed us to probe the effect of the specific mutation and its long-range contributions on the structure and dynamics of the native protein. Some mutant proteins represent a higher degree of dynamical fluctuations than others as evidenced from the time course of almost every dynamical properties such as C a RMSD, C a RMSF, R g, SASA of the hydrophobic core, specific inter-b hydrogen bonding distances, SB distances, etc. On the contrary, some other dynamical variables reduce with time such as the overall number of hydrogen bonds, the number of residues that reside on a helices, and so on. The number of residues that reside on b-sheet structures increase in some of the mutants which are not only due to extension but also due to the formation of new b-sheet structures as observed in proteins like Q160R, H187Y, E200Q, R208K/Q and little less in proteins like V210I and Q212E. It may be noted that though E219Q (similar to protective polymorphic variant) show huge structural flexibility but C a RMSD, R g and the surface exposure of the hydrophobic core remains relatively low compared to others. Thus this mutant shows a comparatively lower propensity to unfold or in other words more stable.
Moreover, PCA and conformational clustering analyses reaffirm that structural fluctuations are more prominent for some of the mutants with respect to others. Additionally, both the techniques indicate the presence of 2 or 3 different sets of protein conformations that might represent either native-like or different unfolding intermediates like structures of the mutants. Thus by using these techniques we can separate different types of protein conformations corresponding to the degree of their unfolding. Such different protein conformations can be easily extracted from the results of our analyses and further studied in detail. Some mutant proteins show 2 major sets of conformations whereas other proteins show another additional set of structures along with the similar sets of conformations observed for the proteins of the first set. It might be concluded that some mutants represent a lower extent of unfolding compared to others and thus show the presence of initially unfolded structures along with the native-like structures. Proteins belonging to other set represent a greater extent of unfolding and thus the presence of extensively unfolded set of conformations along with the native-like and partially unfolded intermediate-like conformations.