Homology modeling and molecular dynamics study on Schwanniomyces occidentalis alpha-amylase

With consumers growing increasingly aware of environmental issues, industries find enzymes as a reasonable alternative over physical conditions and chemical catalysts. Amylases are important hydrolase enzymes, which have been widely used in variety of industrial process such as pharmaceutical, food, and fermentation industries. Among amylases α-Amylase is in maximum demand due to its wide range of applications. The homology modeling study on Schwanniomyces occidentalis amylase (AMY1, UniProt identifier number: P19269) was performed by Modeller using Aspergillus oryzae (6TAA) as the template. The resulting structure was analyzed for validity and subjected to 14 ns of molecular dynamics (MD) simulation trough GROMACS. The validity of obtained model may represent that utilized OPLS force field is suitable for calcium-containing enzymes. DSSP secondary structure and contact map analysis represent the conservation of domain A TIM barrel feature together with calcium ion coordination sphere. Investigating the covariance matrix followed by principle component analyses for the first five eigenvectors of both trajectories indicate a little more flexibility for AMY1 structure. The electrostatic calculation for the final structures shows similar isoelectric point and superimposed buffering zone in the 5–8 pH range.

Enzymes are proteins capable of transforming their specific substrate into the product in shorter time and milder thermal conditions compared with chemical methods (Tawfik, 2014). Due to the fact that micro-organisms possess rapid growth, it is possible to clone and scale up the expression of enzymes for bioprocess. Alpha-amylases (EC 3.2.1.1), found in bacterial, fungal, plant, and animal sources, are responsible for catalyzing the hydrolysis of α-1, 4 glycosidic linkage of glycogen, starch, linear amylose, and branched amylopectin to generate smaller sugars. Alternatively known as 1, 4-α-D-glucan glucanohydrolase or glycogenase, they are generally calcium metalloenzymes (Kandra, 2003). This endoamylase plays an important role in starch processing industry and has been proved as a green substitute for hydrochloric or oxalic acid and high temperature conditions (pH 2 and 150°C) needed in traditional processing units (Vieille & Zeikus, 2001). With the advent of various novel strategies enzymes are now involved in the pharmaceutical, clinical diagnostics, food, and detergent industries (Feitkenhauer, 2003;Zeman & McCrea, 1985). Alpha-amylases play an outstanding role in all the above-mentioned applications, meanwhile considered as a valuable model enzyme for studying biophysical properties and thermal adaptation in proteins (Fitter, 2005;Prakash & Jaiswal, 2010).
Αlpha-amylase is made up of three domains; domain A forms the (β/α) 8 or TIM barrel motif with the catalitic acidic triads of Asp 217, Glu 241, and Asp 308. The protruding calcium-binding domain B (about 70 residues including Asn 132, Glu 173, Asp 186, Asp 217, His 221, and Glu 241) is located between the third beta strand and helix and the C-terminal Greek key domain (Da Lage, Feller, & Janecek, 2004;MacGregor, 1988). Although, protein structures related to several alpha-amylases are currently available in PDB database which shows the importance of this enzyme, there is no threedimensional (3D) structure of Schwanniomyces occidentalis alpha-amylase (Strasser et al., 1989). It is essential to have enough knowledge about 3D structure for understanding the biophysical properties, however obtaining X-ray diffraction quality crystals of flexible multi-domain monomeric protein targets is often difficult (Harris, 2003). The homology modeling is a reliable and an efficient method for the prediction of 3D structure of proteins (Krieger, Nabuurs, & Vriend, 2003;Šali & Blundell, 1993).
In the present study, the 3D feature of AMY1 was obtained by a homology modeling procedure based on the crystal structure of alpha-amylase from Aspergillus Oryzae (TAKA), deposited in protein data bank (PDB code 6TAA with 55% identity with the target sequence) and 2.10 Å resolution (Swift et al., 1991). The models were generated and a stable model was selected with regard to discrete optimized protein energy (DOPE) (Šali & Blundell, 1993;Shen & Sali, 2006). The generated 3D structure was subjected to molecular dynamics (MD) simulations (GROMACS 4.6.7) (Berendsen, van der Spoel, & van Drunen, 1995;Hess, Kutzner, Van Der Spoel, & Lindahl, 2008).
The modeled structure validity was tested through Procheck, verified 3D and packing quality methods and GROMACS trajectory tools (RMSD, R g and RMSF). Secondary structure, contact map and Principle component analysis were performed for both structures. Subsequently, homology-modeled AMY1 was used for comparing the conservation of structural features and calcium coordination sphere.

Homology modeling and structural validation
The homology model was used to build the initial model of AMY1. The first step was searching a number of related sequences to find a related protein as a template.
The sequences of alpha-amylase isolated from the S. occidentalis (AMY1) were retrieved from UniProtKB (http://www.uniprot.org/) that found in 1991, under identifier number of P19269 which consist 512 amino acids (1-25 are signal peptide and 487 residues, 26-512 build the main chain) (Wu, Wang, & Hsu, 1991). The modeled sequence in comparison with the presented UniProt one is a natural variant and possesses A and S instead of D and L in positions 350 and 479, respectively (in strains: CCRC 21164 and ATCC 26077/CBS 2863, i.e. it is due to the few variations in the different strains). An experimentally resolved 3D structure for AMY1 is not available. In order to find the most similar sequence with known 3D structure (template) to each of the target sequence, sequence similarity search was performed on the BLASTp online server (http://blast.ncbi.nlm.nih.gov) (Altschul et al., 1997). The 3D structures of the template, 6TAA were extracted from the Protein Data Bank (PDB) (http://www.rcsb.org/) (Bernstein et al., 1977). The automated sequence alignment and analysis of the template and target were carried out by EMBOSS Needleman-Wunsch method (Rice, Longden, & Bleasby, 2000;Needleman & Wunsch, 1970). This tool reports residues that are identical (percentage of identity) and conserved (percentage of similarity/positive). Modeller 9.14 program was performed to build the 3D structure of AMY1. Modeller is an implementation of an automated approach to comparative modeling by satisfaction of spatial restraints (Eswar et al., 2007;Fiser & Šali, 2003). The predicted structures (100 models) were saved in the PDB format and sorted according to scores calculated from DOPE. The best model was selected with regard to DOPE. The quality of predicted structure was confirmed by PROCHECK (Colovos & Yeates, 1993;Laskowski, MacArthur, Moss, & Thornton, 1993) and ProSA (calculates an overall quality score for 3D structures and shows whether the Z-score of the query input structure lies within the range of scores found for experimentally determined structures of similar size (Wiederstein & Sippl, 2007)). Normality of the local environment of amino acids was checked by WHAT IF coarse packing quality control, which should stay above −5 (Vriend & Sander, 1993). VERIFY-3D method produces average data points for each residue to evaluate the structural quality based on the energetic and empirical methods. Protein structure with more than 80% residues with score of >.2 is considered to be of high quality (Eisenberg, Lüthy, & Bowie, 1997).

Molecular dynamic simulation
Energy minimization and MD were performed through MD simulations by GROMACS (4.6.7) and Optimized Potential for Liquid Simulations (OPLS) force field (Jorgensen, Maxwell, & Tirado-Rives, 1996). The obtained structure from modeller was used as an input. Water box was created with at least 10 Å distances from protein using SPC water model and applying boundary conditions. System neutralization was done by adding sodium ions. The MD simulation was carried out to examine the quality of the model structures by checking their stability via performing 50 ps simulations at a constant temperature 300 K (NVT). Energy minimization was performed in 2000 steps to avoid any kind of bad contacts generated while solvating the system. Then NPT optimization was done for 500 ps. To increase the likelihood of achieving the appropriate structure, the MD simulation was performed for 14 ns using OPLS force field (Jorgensen et al., 1996). GROMACS implemented tools carried out for trajectory analysis. The RMSD and RMSF were calculated for the protein backbone using least-squares fitting. The protein radius of gyration for template and model structures was also obtained. DSSP V 2.1.0 used to follow changes in secondary structure during MD (Joosten et al., 2011). The principle component analyses were performed and the first five normal modes obtained.

Hardware and software
The homology modeling (MODELLER 9v14) and visualization of AMY1 and 6TAA structures (VMD 1.9.2) were carried out with MAC 10.10.1 environment. The MD simulations were performed using GROMACS 4.6.7 in Centos environment. For high duty jobs, one node of a cluster consisting 12 cores (1.4 GHz) AMD CPU with 32 GB RAM were used. The structure validation analysis using WHATIF (http://swift.cmbi.ru.nl/servers/html/in dex.html), VERIFY-3D (http://nihserver.mbi.ucla.edu/ SAVS/), and ProSA (https://prosa.services.came.sbg.ac. at/prosa.php) were carried out online. Contact map analysis was performed on the average structures obtained by GROMACS tools. The contact maps were visualized for carbon alpha (cut off = 1.5 nm) and compared utilizing CMView software freely available by Vehlow et al. (Vehlow et al., 2011). The obtained structure could also be further refined using the fully automated i3Drefine software as the final step in computational structure prediction pipeline (Bhattacharya & Cheng, 2013).

Homology modeling and validation
An optimal sequence alignment is essential to the success of homology modeling. In this study, homology modeling technique was used to construct models of AMY1 based on the appropriate identity among the structures of AMY1 and 6TAA. The homology modeling approach was used to construct the structure of AMY1. The homology model of AMY1 showed root mean square deviation value of 2.1 Å with the template. The sequence alignment used for homology modeling annotated by structural elements is shown in Figure 1. It seems that the two models (AMY1 and 6TAA) have a very similar tertiary structure. The TIM barrel fold β-strands as the key features of alpha-amylase can be observed to be well conserved and superimposed in both structures. Schellman motif amino acids capping the alpha helix c-terminal of the A domain, (Leu 118, His 119, Met 123, Figure 1) are also conserved in both structures. It should be noted that Gly 122 is replaced with Ser in AMY1, which could be a candidate for point mutation studies. However there are some little differences which well demonstrate the criteria declaring homologous sequences could most probably share same structural features in order to provide necessary functional conformation which is more conserved than sequence alone. Figure 1 shows that the alignment results indicate 55% sequence identity between 6TAA and AMY1. MODELLER software applies an inbuilt evaluation score called DOPE to assess the quality of generated structures. The modeled structure with a lowest DOPE score of −55783.08 was selected for validations and simulation studies. PROCHECK was used to evaluate the stereochemical properties of the model (Ramachandran plot, Supplementary Figure 1). In this study, PROCHECK showed 94.2% of amino acids in the core and 4.9% are in allowed regions as presented in Supplementary Figure 2(A). ProSA analysis showed small difference in Z-score of template and model structure as depicted in Supplementary Figure 2(B). VERIFY-3D score per residue were in the average score of >.2 for 90% of residues. Figure 2(A) illustrates the VERIFY-3D score for the modeled and template structures. VERIFY-3D determines the compatibility of an atomic model (3D) with its own amino acid sequence (1D) by assigning a structural class based on its location and environment (alpha, beta, loop, polar, nonpolar, etc.) and comparing the results to good structures (Lüthy, Bowie, & Eisenberg, 1992). As it can be inferred, there are three regions where the score is lower than .2 which are residues 163-173, 312-314, and 384-395 (overlay 26 from 471 which are nearly 5.45% of residues) and therefore the obtained structure meets the criteria for a good model, which states that at least 80% of residues should have VERIFIED-3D score larger than .2. In addition, packing quality obtained using WHAT-IF evaluation criteria is demonstrated in Figure 5, the packing score for nearly all residues were higher than the least score of −5 for predicted structure. Thus, validation methods indicate that the model was acceptable and was chosen for subsequent MD simulation.

MD simulations
The RMSD of protein backbone was followed for 14 ns (Figure 3(A)) After the initial rise, the system reached to a stable state and variations are reduced less than .04 nm in the last ns, indicating a good overall structure alignment with 6TAA. Another structural future of protein MD trajectory is the ability to follow the protein radius (R g ) in curse of simulation time (Figure 3(B)). The R g for predicted structure starts with a little higher value while for the template the protein radius shows a gradual increase in the first ns of simulation. However, as the corresponding data show both structures equilibrate around similar radii. The DSSP secondary structure analysis was performed on the trajectory file in course of simulation time as shown in Figure 4. The average numbers of amino acids present in each secondary structure are depicted with corresponding standard deviations in Figure 5. The number of coil forming residues only seems to be meaningfully different which can represent higher flexibility in AMY1 structure.
The 3D structure of AMY1 was superimposed with 6TAA as shown in Figure 6 It is well demonstrated that the presence of chargecharge interactions could stabilize folded protein structure, however the role of these interactions during folding and in unfolded state are neglected in most computational methods which focus only on the folded or nearly folded states (Azia & Levy, 2009). Both structures posses same number of His (7), Met (9), Tyr (34), and aspartic acid (Asp, 42) while AMY1 possesses 17 glutamic acid (Glu) compared to 12 for 6TAA as it can be inferred from Figure 1.
As mentioned, both structures possess seven His residues where six pairs are well matched in the alignment, five pairs are well conserved, and are in proximity of CA ions. The unmatched His residues are presented in the surface and are in contact with water molecules in both structures (Figures 7-9).
Contact map based on global alignment is represented in figure. The common contacts are in majority and there are minor variations in both structures, however the contact maps associated with alpha helix and beta sheet secondary structures are well superimposed. Contact map analysis is helpful in investigating structural personalities of proteins while providing information about the separation of contacting residues, which cannot be inferred from the 3D images alone. The difference distance map visualizes conformational changes between two structures. For a pair of residues (i, j) the distance map visualizes how much the distance between i and j has changed from 6TAA structure to AMY1. This highlights regions of conformational change as red hotspots, while regions that remain unchanged are colored in blue. An arrow and the circle show the Gly 35 and Asp 157, respectively.
The amount of available hydrophilic and hydrophobic surfaces of AMY1 is shown in Figure 9(A). The proportion of the hydrophilic surface (128 ± 2 nm 2 ) to the hydrophobic surface (104 ± 1.7 nm 2 ) in AMY1 is about 55 and 45%, respectively. Small variations in this curve also show that the time of simulation was enough, because the variations are low after the first few ns.
The modeled structure isoelectric point and enzyme net charge (electron charge) in neutral conditions were calculated to be 4 and −23, and for the template was 3.8 and −19, respectively. This difference in net charge can well be inferred from Figure 9(B) .The graph also illustrates how the charge of a protein is buffered over the range of pH 5-8, which also represents the pH range of template and AMY1 amylase activity. The final structure and the amount of protein surface remained nearly constant. The predicted 3D structure of the S. occidentalis alpha-amylase was submitted to the Protein Model Database (PMDB, bioinformatics.cineca.it/ PMDB/) and was assigned the accession number PM0080148.

Discussion
In this study, we predicted 3D structure of a fungal alpha-amylase that possesses 55% sequence identity (70% similarity) with the template protein sequence TAKA amylase (6TAA). The advantage of using fungi for the production of amylases is ease of manipulation in addition with economical bulk production and application for bioconversion of solid substrates due to hyphal mode of growth (solid-state fermentation, SSF), tolerance to high osmotic pressure, and low water activity (a w ) (Hernández, Rodríguez, Guerra, & Rosés, 2006). The fungal amylases are also generally regarded as safe in the food industry. Fungal a-amylases have been permitted as bread additives since 1955 in the US, which shows its preference over microbial sources (Gupta, Gigras, Mohapatra, Goswami, & Chauhan, 2003).
The protein sequence alignment with template is needed because Modeller can discover structural information based on sequence. This information is applied to simulation of target structure (Shenoy, 2010). Homology modeling, referred to as comparative or knowledge-based modeling, develops a 3D model from a protein sequence based on the structures of homologous proteins (Kaczanowski & Zielenkiewicz, 2010). Here, several structure validation tools were applied in order to qualify the predicted model. Ramachandran plot related to protein structure validation is based on the angles of backbone's covalent bonds (Chen et al., 2010). In this case, Ramachandran plot indicated that nearly all amino acids are in allowed regions. The ProSA structure quality method showed small difference in Z-score between AMY1 and 6TAA, which indicates significant structural closeness between model and template. Other validation methods such as WHATIF packing quality (most residues were >−5) and VERIFY-3D confirmed the validity of the predicted structure in comparison with template structure. The structures with verified 3D score of larger than .2 for 80% of residues are considered acceptable. The VERIFY-3D score for refined model were improved 5% as it can be inferred from Figure 2(A).
From the MD view point, variations were comparable and converged along simulation time especially in case of R g . Although the amino acid compositions of both structures are 45% different (Figure 1), RMSD and RMSF scores show similar patterns. The secondary structure analysis over the trajectory (Figure 4) is also very similar considering the fact that 6TAA structure lacks 21 N-terminal residues. The average numbers of amino acids present in each secondary structure are represented in Figure 5. The comparison of two RMSF data shows, in average, higher fluctuations in AMY1, especially in the c-terminal domain. PCA or essential  As it can be inferred from alignment file, RMSF and contact map, the domain A is much more conserved than the other two domains which seems to be a pattern in alpha-amylase family. The sequence and structural analysis of calcium ion coordination sphere also show the conserved microenvironments, especially for the distal calcium ion, which is surrounded by residues from the B domain.
The presence of surface charge-charge interaction may increase the folding rate and hence protein stability as discussed recently (Tripathi, Garcìa, & Makhatadze, 2015;Tzul, Schweiker, & Makhatadze, 2015). The surface properties of AMY1 is an important MD approach and shows that the change in the content of hydrophilic and hydrophobic surfaces were in equilibrium state from the first ns of trajectory and remained stable for the predicted structure. From different point of view, the decreases in the number of ionic interactions cause greater flexibility in alpha-amylase family and adopt them for activity in lower temperatures (Hiteshi & Gupta, 2014). AMY1 remains active in acidic conditions (pH 4) while it only keeps 20% of the original activity at slightly alkaline conditions of pH 8. The optimum pH for TAKA amylase is reported to be around 5.3 which is close to 6 for AMY1 (Takahasgi, Toda, Nishibe, & Yamamoto, 1982).
Comparison of thermal stability shows that AMY1 (expressed in Saccharomyces cerevisiae) optimum temperature and pH had been reported to be 40°C (Wang, Lin, & Hsu, 1989). The AMY1 lost its activity completely at 70°C. Similarly, the catalytic activity has been showed to be maximal at 40°C, for Taka-amylase (Wang, Gu, Cui, & Liu, 2011).
The residues involved in interacting with calcium are well conserved in both structures. Therefore, the effect of calcium ion on the stability of taka amylase can also be expected for AMY. Taka-amylase A, and the liquefying amylase of Bacillus subtilis are sensitive to EDTA treatment.
The 6TAA melting temperature (T m ) has been determined to be 71°C in calcium saturation (CS) and 57°C in calcium depleted (CD) conditions. It is interesting to note that the stability of B. licheniformis (BLA, 1BLI, 483 residues and 58.27 Da) with 3 Ca ions is 102°C in CS condition and 51°C for CD situation. Therefore the melting temperature of TAKA amylase is nearly 6°C higher in CD condition. B. amyloliquefaciens (BAA) T m is also much lower, 38°C in CD conditions (Duy & Fitter, 2005). Therefore studying the TAKA and AMY1 amylase structures could be helpful in designing thermostable amylase which could be active under calcium depletion conditions. It is interesting to note that, Declerck et al. have also build a homology model of BLA based on the Xray structure of taka-amylase A from A. oryzae (Matsuura, Kusunoki, Harada, & Kakudo, 1984). Although they conclude that no extensive interpretation could be done because of the low sequence similarity between BLA and taka-amylase (about 25%) (Declerck, Joyet, Trosset, Garnier, & Gaillardin, 1995), later comparison with the determined X-ray structure of BLA reveals that the overall conformation of the protein had been properly modeled (Declerck et al., 1997). Site-directed mutagenesis of taka-amylase A catalytic active-site residues have been also studied before, consisting on the importance of homology modeling in elucidating structural features of amylase enzymes (Nagashima et al., 1992).
The validity of obtained model may represent that the utilized OPLS force field is suitable for calcium-containing enzymes. Despite the powerful background and high performance MD calculations, the starting structure and the time course of simulation might affect the resulting structure. Therefore, the major structural characteristics with regard to performed analyses were discussed here.

Conclusion
Fungal amylase has been shown to have advantages in terms of safety and solid-state applications over other microbial sources. Taka amylase is also indicated to have high thermal stability under calcium depletion conditions. The 3D structure of AMY1 has not been reported yet. Production and optimization of activity are required to understand the relationship between enzyme structure and sequence. In this investigation, the 3D structure of the Schwanniomyces alpha-amylase was built by homology modeling. Then energy minimization and MD were used to provide an insight regarding the structural and flexibility of both model and template. The obtained 3D structure can be considered reliable not only due to the computational methods but because of the conserved structural features associated with amylase enzymes. Usefulness of homology modeling method could be inferred from comparisons with the template. The 3D structure of AMY1, as many other industrial enzymes, may facilitate protein engineering studies.

Supplementary material
The supplementary material for this paper is available online at http://dx.doi.org/10.1080/07391102.2016. 1154892.