How do the mutations in PfK13 protein promote anti-malarial drug resistance?

Abstract Plasmodium falciparum develops resistance to artemisinin upon exposure to the anti-malarial drug. Various mutations in the Plasmodium falciparum Kelch13 (PfK13) protein such as Y493H, R539T, I543T and C580Y have been associated with anti-malarial drug resistance. These mutations impede the regular ubiquitination process that eventually invokes drug resistance. However, the relationship between the mutation and the mechanism of drug resistance has not yet been fully elucidated. The comparative protein dynamics are studied by performing the classical molecular dynamics (MD) simulations and subsequent analysis of the trajectories adopting root-mean-square fluctuations, the secondary-structure predictions and the dynamical cross-correlation matrix analysis tools. Here, we observed that the mutations in the Kelch-domain do not have any structural impact on the mutated site; however, it significantly alters the overall dynamics of the protein. The loop-region of the BTB-domain especially for Y493H and C580Y mutants is found to have the enhanced dynamical fluctuations. The enhanced fluctuations in the BTB-domain could affect the protein–protein (PfK13-Cullin) binding interactions in the ubiquitination process and eventually lead to anti-malarial drug resistance. Communicated by Ramaswamy H. Sarma


Introduction
Artemisinin and its derivatives are the most effective and widely used treatment for Plasmodium falciparum malaria. The emergence and spread of drug resistance in the malaria parasite is a severe public health concern all over the malaria-endemic countries (Dondorp et al., 2011;Fairhurst et al., 2012;Turschner & Efferth, 2009). Artemisinin resistance is defined as delayed parasite clearance following treatment with artemisinin mono-therapy or with artemisinin-based combination therapy Dhorda et al., 2021;Fairhurst & Dondorp, 2016). Early signs of delayed response to artemisinin in P. falciparum parasite were first reported in 2009 in northwest Cambodia (Ashley et al., 2014;Dondorp et al., 2009). This phenotype has been now reported in Thailand (Ashley et al., 2014;Phyo et al., 2012), Vietnam (Hien et al., 2012;Thriemer et al., 2014), Myanmar (Ashley et al., 2014;Kyaw et al., 2013) and China (Huang et al., 2015). Resistance to artemisinin has also been reported in India (Das et al., 2018;Mishra et al., 2015), and is highly suspected in South America (Rosenthal, 2018) and East Africa (Roper et al., 2014;Talisuna et al., 2012).
Single-site mutations in the Plasmodium falciparum Kelch13 protein (PfK13) were associated with the artemisinin resistance Fairhurst, 2015;Straimer et al., 2015). Till now, 124 different non-synonymous kelch mutations have been identified that are associated with the artemisinin resistance (Anderson et al., 2017;Dafalla et al., 2020;M enard et al., 2016). The frequent and most prevalent mutations detected in resistant populations are Y493H (tyrosine(Y) @493 is replaced by histidine(H)), R539T (arginine(R) @539 is replaced by threonine(T)), I543T (isoleucine(I) @543 is replaced by threonine(T)) and C580Y (cysteine(C) @580 is replaced by tyrosine(Y)), respectively (Ghorbal et al., 2014;M enard et al., 2016;Singh et al., 2016). The C580Y mutation occurs in about 55% of samples with any resistance mutation in Southeast Asia and is assumed to be one of the strong molecular markers for artemisinin resistance (Singh et al., 2016). The next most prevalent mutation is Y493H, which occurs in about 10% of samples with any resistance mutation in Southeast-Asia (Taylor et al., 2015).
The PfK13 protein contains 726 amino-acid residues consisting of non-conserved apicomplexa-specific N-terminal region (amino-acids 1-211), highly conserved coiled-coil-containing domain (amino-acids 212-341), well-conserved BTBdomain (amino-acids 350-442) and C-terminal Kelch-domain (amino-acids 443-726) as shown in Figure 1(b) (Copp ee et al., 2019;Haldar et al., 2018;MalariaGEN Plasmodium falciparum Community Project, 2016). There are only two crystal structures of PfK13 available in Protein Data Bank i.e., 4YY8 and 4ZGC. From the crystal structure in 4YY8, the protein is predicted as a monomer in biological assembly (Jiang et al., 2015a) and 4ZGC is shown to be dimer form (Jiang et al., 2015b). The N-terminal and CCC-domain are missing in these crystal structures. The crystal structures consist of Kelch-domain and BTB-domain that are associated with the artemisinin drug resistance. The Kelch-domain (propeller domain) has six kelch motifs and the mutations are present in this propeller domain (Figure 1(a)). The BTB-domain interacts with Cullin protein and gets involved in protein-protein interactions Geyer et al., 2003;Paloque et al., 2022;Pinkas et al., 2017;Zhang et al., 2004). However, the Kelch-domain acts as a receptor for substrate proteins to be ubiquitinated (Adams et al., 2000;Canning et al., 2013;Prag & Adams, 2003).
The proposed mechanism of artemisinin resistance is the proteostatic dysregulation of P. falciparum phosphatidylinositol-3-kinases (PfPI3K) enzyme (Mbengue et al., 2015). By binding to PfPI3K enzymes, PfK13 leads to its ubiquitination and degradation. Mutation in the PfK13 prevents this interaction, thus increasing the levels of PfPI3K enzyme and its product phosphatidylinositol 3-phosphate, subsequently leading to artemisinin resistance Suresh & Haldar, 2018).
A large number of PfK13 mutations have been associated with the artemisinin resistance Ghorbal et al., 2014;Straimer et al., 2015), however, why do these specific mutations instigate drug resistance is not known yet. The primary aim of this work is to understand the comparative structural dynamics of the wild-type and four mutated proteins (Y493H, R539T, I543T and C580Y) by adopting classical molecular dynamics simulations (Bhatt & Ali, 2021;Bisht et al., 2019;Kour et al., 2020;Rathee et al., 2022). The analysis has been carried out on monomer unit of PfK13 protein (PDB ID: 4YY8) that contains the Kelch-domain and BTB-domain.

System preparation
The classical molecular dynamics simulations were performed on the Wild-Type (WT) and four mutants of the PfKelch 13 protein. The initial coordinates were obtained from the reported crystal structure (PDB ID:4YY8) (Jiang et al., 2015a). All the four mutants were generated using the mutagenesis function of Pymol by substituting the residue of WT structure with the mutant residue (DeLano, 2002). AMBER FF14SB force-field was used to generate the force fields of WT and mutant structures (Maier et al., 2015). Missing hydrogens in the crystal structures were added using the Leap module of Amber 16 software. Prior to the system preparation, the pKa value of the individual amino-acid were calculated on the PDB2PQR server Version 1.6 (Dolinsky et al., 2007). We have used PROPKA to assign protonation states at normal physiological pH 7.4. The salt concentration of 0.036 M NaCl was used. Amber provides the protonation state for all titrating side chains of ASP, GLU, HIS, LYS, TYR and CYS and it also changes the histidine (HIS) to HIE (neutral, e-nitrogen protonated) depending upon the protonation state. The added counter ions for each system are mentioned in Supplementary Table S1.

Molecular dynamics simulations
Simulations were performed on the monomeric state of PfK13, using the NAMD 2.12 simulation package (Copp ee et al., 2019;Nelson et al., 1996). The TIP3P water model was used for solvating each system maintaining at least 15 Å of the separation between the solute and the edges of the box (Mark & Nilsson, 2001). The system was centered in the periodic box of dimensions of 90.53 Â 118.56 Â 84.13 Å 3 containing water molecules with added sodium and chloride ions. The number of water molecule for each system is mentioned in Supplementary Table S1. The Particle Mesh Ewald (PME) approach was employed for the long-range interactions and van der Waals interaction were evaluated using a cut-off value 10 Å (Darden et al., 1993). The temperature and pressure were controlled using Langevin algorithm at 300 K and 1 atm, respectively (Goga et al., 2012). The energy of each system was minimized by using the conjugate gradient algorithm with 250,000 steps to remove all the steric clashes and attain lowest possible energy. After energy minimization, the solvent equilibration was performed for 1 ns at 300 K at NVT ensemble. At each integration step, the velocities were reassigned from a new Maxwell distribution and the temperature was incremented by 0.0001K. The protein was slowly released by applying harmonic restraints with force constants of 99,25, 1.0, 0.1 and 0.001 kcal mol -1 Å -2 for 1.25ns. The SHAKE algorithm was applied to constrain all the bonds including hydrogen atoms (Ryckaert et al., 1977). A time step of 2 fs was used for integration the equations of motion. After removing the constraint force, 400 ns long NPT trajectory was produced. Out of which first 200 ns was not included in the analysis and considered as the equilibration step. We have realized that all the systems were completely equilibrated till 100 ns, however due to several internal structural changes, Y493H was not equilibrated. To make it consistent for all the cases the initial 200 ns was considered as the equilibration step and subsequently 200 ns used for the structural analysis.

Trajectory analysis
The simulated trajectories of all systems were analyzed to calculate various stability parameters involving root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (R g ) and hydrogen-bond interactions. Visual Molecular Dynamics (VMD 1.9.1) software was utilized for visualization and analysis of the trajectories . Bio3D package version 3.1.0 was used to perform the Dynamics cross-correlations matrix (DCCM) and Principal components analysis (PCA) (Grant et al., 2006).

Results and discussion
To understand the protein dynamics of WT and mutant structures we have performed the comparative analysis of the simulated trajectories utilizing the various computational tools. The details are discussed in the following sections. The secondary structure predictions and DCCM analysis have been discussed in the subsequent subsections.

Comparison of protein dynamics of WT and mutants
To obtain the comparative protein dynamics, the RMSD analysis have been performed on the simulated individual five trajectories (one WT and four mutant-type). The protein heavy atoms (Ca, C, O, N) excluding the hydrogen atoms were considered for the calculations of the RMSD values. The domain-wise analysis (Kelch-domain, BTB-domain and Loop-region) has been performed and compared with the complete-protein. The specific region based on the amino-acids residues were chosen to plot the RMSD graph for the different domains, however for the complete protein the entire region (amino-acid residues 350-726) is considered (Figure 1). The RMSD plots for all the five trajectories are given in Figure 2. In all the mutants, the mutation sites are located within the Kelch-domain only. However, the comparison of the RMSDs for the Kelch-domain does not indicate any significant change in the RMSD fluctuations. It is expected to observe significantly different fluctuations in the mutated regions in comparison to the WT. The RMSD fluctuations for the Kelch-domain are quite small throughout the simulation for all the cases and the average value of RMSD ranges from 0.88 Å to 1.10 Å. For the Y493H it is 0.88 Å. This indicates that Y493H has significantly less fluctuations for the Kelch-domain compared to WT and other mutant proteins (Figure 2).
Further investigations reveal that a non-covalent electrostatic interaction between the His493 and Glu426 induces compactness in the Kelch-domain in Y493H. We also compared the pore diameter of Kelch-domain to further investigate the compactness of the protein (Supplementary Figure  S2). It revealed that Y493H indeed possesses the smallest pore of 9.8 Å compared to the average value of 10.3 Å for all other systems. Furthermore, a quite large fluctuations in the BTB-domain, especially at the N-terminal region is observed for all the cases. The average value of the RMSD fluctuations for heavy atoms in the BTB-domain for all the cases ranges from 1.86 Å to 3.14 Å. The average RMSD value for the WT is 1.94 Å however, it is 1.86 Å for the R539T which is the lowest value among all the mutants Figure 3.
In Y493H, we have observed quite a rich feature in the RMSD plot due to conformation change in the protein. The average RMSD value of 1.45 Å is observed in the corresponding trajectory up to 85 ns, and then the switching of the confirmation occurs. Subsequently, other conformational changes are observed at 155 ns Figure 2b.
The primary reason behind these confirmation switches is the change in the protein's secondary structure, which has been described in Section 3.3. The RMSD value increases when going from the turns to coils. The signature of the conformation change is captured in the probability distribution plot as well (see Figure 4c). Three different peaks are observed for Y493H with the RMSD values of 1.45, 2.46 and 3.33 Å and each peak corresponds to the three different conformations.
In C580Y, the different scenarios have been observed in the RMSD plot, where the large fluctuations are observed for the complete-protein rather than the BTB-domain Figure 2e. The average RMSD value of the complete-protein is 2.55 Å Figure 3. The probability distribution of RMSD also indicates the broader peak for the complete protein Figure 4a. It reveals that the higher fluctuation in this particular mutant is due to the loss of hydrogen-bond interaction between Cys580-Gly533 residues (Supplementary Figure S4). No large fluctuations are observed in the R539T and I543T mutants.
It has been observed that among four mutants, there are larger fluctuations in the dynamics of the BTB-domain of Y493H and C580Y as compared to the WT. Even though both of the mutants have larger fluctuations in the BTBdomain than the WT, the microscopic details of the dynamics of C580Y and Y493H are completely different from each other. It has been observed that the mutation in the Kelch domain does not have any local impact but it has a large impact on the overall structure and dynamics of the BTBdomain as well as on the loop-region of the BTB-domain.
We further analyzed the effect of mutations on the compactness of the protein by calculating the radius of gyration (R g ). The average R g value in between the range of 22.18-23.12 Å is obtained for all the cases (Supplementary Figure S3). The average R g value for the Y493H is calculated as 22.18 Å. The decrease in R g value for Y493H is responsible for the overall compactness in this particular protein.
It has been further revealed that a non-covalent interdomain hydrogen-interaction between the residue His493 and Glu426 makes the structure more compact Figure 5a. The probability distribution curve indicates that the average value for the interaction is 3.38 Å which is almost more inclined to the formation of the hydrogen bond Figure 5b. No similar interaction pattern is found in any other mutant and is also absent in WT Figure 5d.

Comparison of individual residue fluctuation
The fluctuations of individual amino-acid residues are responsible for the overall dynamics of the protein. To quantify these fluctuations, the RMSF analysis is performed along the simulated trajectories. The protein Ca atoms were considered for the calculations of the RMSF values. It is observed that in all the studied proteins the residues in the BTBdomain (350-442 residues) fluctuates significantly more compared to the Kelch-domain (443-726 residues) as shown in Figure 6. However, larger fluctuations are observed for the Y493H and C580Y compared to the other mutants and WT. It is also observed that the RMS fluctuations in the WT are the lowest compared to all the mutants, except R539T which shows similar dynamics as the WT.
The large fluctuations in RMSF for Y493H and C580Y are primarily arising from 370 to 390 amino-acid residues that constitute the loop-regions of the BTB-domain (shown in inset box). Thus, it is evident that the loop-regions of the two particular mutants dominate the protein dynamics. We realized this from the proteins RMSD analysis as well.
Comparing the protein dynamics between Y493H and C580Y, it have been realized that even though both of them possess larger fluctuations in the loop regions, but the microscopic origin of the protein dynamics is completely different from each other. In C580Y, Cys580 residue loses a hydrogen bond interaction with Gly533 (Supplementary Figure S4), while in the case of Y493H the additional hydrogen bond between His493 and Glu426 compacts the protein but with the cost of enhancing the fluctuations of 370-390 residues. We observed that the mutation affects the atomistic motion of the residues that are far from the mutated site.

Secondary structure analysis
The secondary structure is a crucial component to study the structural behavior of the protein. We have performed the secondary structure analysis for all the cases and compared  them with the WT. The secondary structure content is calculated only for the BTB-domain (350-442 residues) as no structural deviation was observed for the Kelch-domain.
The graphical representations of the secondary structures for all the cases are shown in Figure 7. It is observed that the loop-region, especially the 367-379 amino-acid residues of the BTB-domain shows the major difference in the structure, mainly in the Y493H. However, no large conformational differences are observed in between the 380 and 442 amino-acid residues except for Y493H, where the trend of secondary structure is changed around 416-421 amino-acid residues.
In WT, the secondary structure content of loop-region are primarily the mixture of turns and 3 10 -helix especially from initial to 110 ns of the trajectory, afterward this pattern changes to a mixture of turns and a-helix. To understand the conformational change in the loop-region the comparisons between the protein structures are performed. The structures obtained at different snapshots in the trajectory are superimposed to quantify the movement of loop-region. To do so, the snapshots at every 40 ns interval are taken from 0 to 200 ns and superimposed them together. A quite nice superimposition of the structures is observed only in the Kelchdomain not with the BTB-domain due to structural flexibility of the loop-region. The average change of positions for Gln391 (highest fluctuating residue in the loop-region) is approximately 10 Å (see Supplementary Figure S5).
In Y493H, three different conformations are obtained for the secondary structure. It forms turn and mixture of a-helix/3 10 -helix from 0 to 85ns, afterward turn and 3 10 -helix dominate up to 155 ns. Then up to 200 ns it is dominated by 3 10 -helix and coils. The residues (417-422) that are close to the hydrogen-bond interaction region (His493-Glu426) is completely dominated by 3 10 -helix, however, it forms turns in the WT. By superimposing the Y493H protein structures, the average distance obtained between the first and last structure is 29.92 Å which is larger than the 10 Å in WT (Supplementary Figure S5). The secondary structure change in the R539T is similar to the WT.  In C580y, the secondary structure consists of turns and a-helix from 0 to 83 ns, afterward, it is dominated by turns and a mixture of a-helix and 3 10 -helix. Comparing the average distance between the loop-region, we obtained the value of 11.94, 14.18 and 23.08 Å for the R539T, I543T and C580Y mutants, respectively.
We observed that there is large change in the overall secondary structure of Y493H. Three different conformations are observed in the loop-region of the BTB-domain. The loopregion is dominated by the turn while it changes to the coil as the simulation proceeds. The residues 417-422 are completely dominated by the 3 10 -helix in Y493H. This pattern of secondary structure is not observed in any other mutants. It indicates that mutation in the Y493H has large impact on the secondary structure as compared to the other mutants.

Dynamic cross-correlation matrix
It has been observed that a single mutation in the Kelchdomain can incite significant dynamical fluctuations in the BTB-domain but it is not clear how the fluctuations of the individual residues are correlated with each other especially those residing far apart. A proper understanding of the correlated motions can shed light on the long-range interaction patterns and eventually at the allosteric behaviors (Bhatt & Ali, 2021;Kasahara et al., 2014;Zhang et al., 2018).
To harness the correlated behaviors of the residues, the DCCMs analyses have been performed by computing all the Pearson cross-correlation coefficients for the backbone Ca atoms of all the residues. The DCCM maps for the WT, Y493H and C580Y are graphically presented in Figure 8 as we have observed the fluctuations in these mutants. However, the DCCM map for the other two mutants is shown in Supplementary Figure S7. The correlation matrix for the complete protein is shown in Figure 8a and zoomed view of the BTB-domain is shown in Figure 8b. The cyan and magenta colors depict the correlation and anti-correlation respectively, whereas white space depicts non-correlation dynamical behaviors between the residues.
For the BTB-domain, a very interesting pattern in between the residues 350 and 370 have been found Figure 8b. The correlation behavior in this region is nearly similar for all the cases. The two positive correlations are crossing diagonally each other which is a rare phenomenon observed for this particular region. Correlation is found between residues 350-355 and 360-370 however, the uncorrelation is observed among the residues 355-360. This is the region that forms the turns in the secondary structure. The residues 370-390 (red-dotted region) have anti-correlation in the WT however, in the Y493H and C580Y the anti-correlation behavior is getting disrupted and uncorrelated motions in between the residues is dominated.
Furthermore, we look into the complete-protein region of the WT and compare it with the mutant proteins ( Figure  9(a)). In the WT, the strong correlation between the protein residues 410-530 and 560-660 is observed as shown in box c and b respectively. The strong anti-correlation behavior is observed for the residues 565-680 (shown in box a and d). This pattern of correlation is completely disturbed for the mutant cases. It is quite evident that the mutation weakens the residue-residue interaction which is incline with the previous studies (  . The dynamics cross-correlation maps (DCCMs) for the WT, Y493H and C580Y mutants obtained from the entire production dynamics. The black dotted region (residues 350-442) indicates the BTB-domain and the zoomed view is shown in (b). Color Code: cyan color bar belongs to the (positive) correlation, magenta is for anti-correlation and white depicts uncorrelation. In the WT there is correlation within the residue however, significantly increased anti-correlation or uncorrelation in the mutants. 2018). The presence of strong cross-correlation motions in WT protein indicates strong interaction between the residues. However, the decreased correlated residue motions in the mutants hinted toward the loss of contact among the residues. It further indicates that the mutations affect the long-range dynamical correlation and affect the overall protein dynamics of the residues

Conclusions and outlook
The classical molecular dynamics simulation has been performed on PfK13 protein that has been associated with the ubiquitination process and anti-malarial drug resistance. The protein dynamics of the WT and four mutants i.e. Y493H, R539T, I543T and C580Y have been performed and compared with each other. Despite being in the mutated site, structural dynamics of Kelch-domain remain almost same for all the cases as of WT. However, we have realized that mutation in the Kelch-domain instigates larger structural fluctuation in the BTB-domain. Furthermore, it has been noticed that specifically, the loop region of the BTB domain is the primary reason for the large fluctuation.
Although the simulations have been performed in same environment for all the mutants, the largest deviations in the dynamics occur only for the Y493H and C580Y. The non-covalent electrostatic His493-Glu426 interaction, as well as the secondary structure change, is also responsible for the large dynamical fluctuations in Y493H. However, the absence of non-covalent intradomain interaction (Cys580-Gly533) causes the higher fluctuations in C580Y. The mutations also change the dynamical cross-correlation matrix pattern of the residues and break the interdomain correlation in the proteins. In a nutshell, the mutations in Kelch-domain enhances the structural fluctuations in the BTB-domain that binds with Cullin protein. Could the large fluctuations of the BTB-domain of the mutants be the reason for anti-malarial drug resistance?
Probably yes, as the BTB-domain associates with the Cullin protein to regulate the ubiquitination of the substrate protein, the protein-protein interaction get hampered with the enhanced dynamical fluctuations in the protein and was proven earlier in other instances (Acosta-Tapia et al., 2020;Keskin et al., 2008;Papaleo et al., 2016;Peters et al., 2005;Wang et al., 2019). This implies that disruption in the binding interaction will probably hinder the process of ubiquitination and cause the drug resistance in the mutant protein. This study thus opens up a new dimension that protein-protein interactions between Cullin-BTB-domain might be playing the pivotal role for the drug-resistance. However, dedicated in silico and in vitro studies with complete proteins are required to harness the complete understanding.