A comparative modeling and molecular docking study on Mycobacterium tuberculosis targets involved in peptidoglycan biosynthesis

An alarming rise of multidrug-resistant Mycobacterium tuberculosis strains and the continuous high global morbidity of tuberculosis have reinvigorated the need to identify novel targets to combat the disease. The enzymes that catalyze the biosynthesis of peptidoglycan in M. tuberculosis are essential and noteworthy therapeutic targets. In this study, the biochemical function and homology modeling of MurI, MurG, MraY, DapE, DapA, Alr, and Ddl enzymes of the CDC1551 M. tuberculosis strain involved in the biosynthesis of peptidoglycan cell wall are reported. Generation of the 3D structures was achieved with Modeller 9.13. To assess the structural quality of the obtained homology modeled targets, the models were validated using PROCHECK, PDBsum, QMEAN, and ERRAT scores. Molecular dynamics simulations were performed to calculate root mean square deviation (RMSD) and radius of gyration (Rg) of MurI and MurG target proteins and their corresponding templates. For further model validation, RMSD and Rg for selected targets/templates were investigated to compare the close proximity of their dynamic behavior in terms of protein stability and average distances. To identify the potential binding mode required for molecular docking, binding site information of all modeled targets was obtained using two prediction algorithms. A docking study was performed for MurI to determine the potential mode of interaction between the inhibitor and the active site residues. This study presents the first accounts of the 3D structural information for the selected M. tuberculosis targets involved in peptidoglycan biosynthesis.


Introduction
Tuberculosis, an infectious disease that mainly results from aerosol transmission of Mycobacterium tuberculosis, remains a serious global health threat. In 2014, 9 million people developed tuberculosis and 1.5 million died from the disease, of whom 360,000 were also HIVpositive (http://www.who.int/tb/publications/factsheet_ global.pdf). Peptidoglycan (PG) with a robust lipidic backbone in cell wall supports cell shape, install rigidity, and provides resistance to osmotic pressure (Alderwick, Birch, Mishra, Eggeling, & Besra, 2007). It is therefore a vital 'organelle' that is required for the survival and growth of virtually all bacteria, including M. tuberculosis (Alderwick et al., 2007). Lamichhane et al. (2011) reported that the pathway of peptidoglycan biosynthesis is essential for the growth of M. tuberculosis, with various products of the enzymes in this pathway, which are identified as essential metabolites. Several reactions are involved in cytoplasmic membrane and plasma membrane steps during the complex process of peptidoglycan biosynthesis ( van Heijenoort, 2001) (Figure 1). These steps include four major pathways: formation of (a) UDP-GlcNAc from fructose-6-phosphate, (b) UDP-Mur-NAc from UDP-GlcNAc, (c) formation of UDP-Mur-NAc-tetrapeptide or UDPMurNAc-pentapeptide resulting in the assembly of peptide stems, and (d) side pathways of D-glutamic acid and the dipeptide D-alanyl-D-alanine biosynthesis (Manca et al., 1999). These steps and pathways are presented in Figure 1. Biosynthesis of peptidoglycan in the plasma membrane stage includes assembly and polymerization of the disaccharide-peptide monomer (Bouhss, Crouvoisier, Blanot, & Mengin-Lecreulx, 2004). Polymerization reactions take place at the outer side of the cytoplasmic membrane through lipids I and II ( Figure 1) that have been dislocated through the membrane ( van Heijenoort, 2001). A recent project by our research group reported the homology modeled structures for various enzymes (GlmS, GlmM, GlmU, MurA, MurB, MurC, MurD, MurE and MurF) involved in the cytoplasmic steps of peptidoglycan synthesis of M. tuberculosis (Moraes et al., 2015). The purpose of this study was to obtain structural information about the seven enzymes whose 3D structures have not yet been reported, Glutamate racemase (MurI), UDP-diphosphomuramoylpentapeptide beta-N-acetylglucosaminyltransferase (MurG), phospho-N-acetylmuramoyl-pentapeptidetransferase (MraY), succinyl-diaminopimelate desuccinylase (DapE), dihydrodipicolinate synthase (DapA), D-alanine-D-alanine ligase (Ddl), and alanine racemase (Alr), (Figure 1).
The purpose of this study was to obtain structural information about the seven enzymes of which the 3D structures that are not yet reported [Glutamate racemase (MurI), UDP-diphospho-muramoylpentapeptide beta-N-acetylglucosaminyltransferase (MurG), phospho-N-acetylmuramoyl-pentapeptide-transferase (MraY), succinyl-diaminopimelate desuccinylase (DapE), dihydrodipicolinate synthase (DapA), D-alanine-D-alanine ligase (Ddl), and alanine racemase (Alr), (Figure 1). Important to note that the targets involved are from the CDC1551 strain of M. tuberculosis (Lamichhane et al., 2011). An unexpectedly high degree of genome sequence variation was determined between different tuberculosis strains with much of the variation being present in a large panel of clinical M. tuberculosis isolates (Bishai Figure 1. Pathways for selected enzymes involved in the intracellular part of peptidoglycan biosynthesis of M. tuberculosis are presented. Notes: The targets marked in blue and green are involved in cytoplasmic membrane and plasma membrane steps, respectively. The targets in black were studied by our research group in previous work (Moraes et al., 2015). The diagram was redrawn from Lamichhane et al. (2011Lamichhane et al. ( ). et al., 1999. CDC1551, as a strain involved in a recent cluster of tuberculosis cases, is known to be transmissible and virulent in humans, in contrast to H37Rv,  and is highly infectious (Moraes et al., 2015). CDC1551 has greater immune reactivity than H37Rv and other clinical strains (Fleischmann et al., 2002;Manca et al., 1999).
Apart from its key role in the biosynthetic pathway of the peptidoglycan, MurI interacts with DNA gyrase in some bacteria. MurI from Escherichia coli was shown to alter the function of the gyrase enzyme in the presence of peptidoglycan precursor and to serve as an activator for alanine racemase (Ashiuchi et al., 2002), while Bacillus subtilis was shown to interfere with DNA gyrase activity (Ashiuchi, Kuwana, Komatsu, Soda, & Misono, 2003). M. tuberculosis MurI exhibits these dual characteristics (Sengupta, Ghosh, & Nagaraja, 2008).
Glycosyltransferases (MurG family) employ an activated nucleotide sugar donor and include a series of enzymes with significant roles in several prokaryotes and eukaryotes biological processes (Koya et al., 1999;Verbert & Cacan, 1999). The PG synthesis in M. tuberculosis has been assumed to resemble the pathway established for E. coli (Ha, Walker, Shi, & Walker, 2000). The transfer of N-acetyl-D-glucosamine to lipid (II) is catalyzed by MurG. In all MurG homologs, there are several invariant residues that are located at or near a cleft between the two domains. These highly conserved residues in the different strains of bacteria makes them efficient for substrate binding or catalytic activity (Ha et al., 2000). Structural homology considerations suggest three binding sites namely: donor binding site, membrane association site, and acceptor binding site. Ile238, Ile245, Arg261, Glu269, Glu272, and Gln289 are involved in the donor binding sites that are located in the C-domain. The accep-tor binding site in the N-domain contains three highly conserved regions, two of which are glycine rich loops. The third one extends from the amino acids 126-138, which comprises an α-helix and β-sheet in the N-domain (Ha et al., 2000). The role of these residues offers an opportunity for further studies on the kinetic analysis of mutants. The membrane association site involves hydrophobic and electrostatic interactions with the negatively charged bacterial membrane. The hydrophobic patch consists of Ile75, Leu79, Phe82, Trp85, and Trp116 in the N-domain, which is encompassed by Lys69, Lys72, Arg80, Arg86, Arg89, and Lys140 as basic residues (Ha et al., 2000). MurG is the last enzyme in the intracellular phase of peptidoglycan biosynthesis III pathway, and leads to the formation of lipid II (Bugg & Walsh, 1992). This enzyme was studied as a promising target for new antibacterial agents (Mohammadi et al., 2007).
MraY is an integral membrane enzyme responsible for catalyzing the first membrane steps in bacterial cell wall PG synthesis III, namely transferring the phospho-N-acetylmuramoyl-pentapeptide motif into the undecaprenyl lipid (I) phosphate carrier. In vivo, functional assays and in vitro enzymatic assay of MraY from Aquifex aeolicus demonstrated that 14 fixed polar/charged amino acid residues, including Asp117, Asp118, Asp265, His324, and His325, are localized in the active site region, which are catalytically important (Chung et al., 2014). This target was successfully inhibited by several classes of natural product inhibitors (Bouhss et al., 2004;Shapiro, Jahic, Gao, Hajec, & Rivin, 2012;Winn, Goss, Kimura, & Bugg, 2010).
Diaminopimelic acid (DAP) is an essential building block in the lysine biosynthesis I pathway of peptidoglycan biosynthesis, and is unique to bacteria. Disrupting the DAP enzymes presumable leads to destabilization of the peptidoglycan layer, causing cell death. DapE enzymes are a member of the meso-diaminopimelate (mDAP)/lysine biosynthesis pathway, and exist in most of pathogenic Gram-negative and Gram-positive bacteria (Desmarais, Bienvenue, Bzymek, Petsko, & Holz, 2006;Gillner, Armoush, Holz, & Becker, 2009;Gillner, Becker, & Holz, 2013;Pavelka & Jacobs, 1996;Rowsell et al., 1997). Lysine is involved in the biosynthesis of proteins that are utilized in the peptidoglycan layer of cell walls in Gram-positive bacteria (Born & Blanchard, 1999;Scapin & Blanchard, 1998). mDAP products are important components of peptidoglycan cell walls of Gram-negative bacteria, which provide a link between the polysaccharide chains (Born & Blanchard, 1999;Scapin & Blanchard, 1998). Deletion of the gene encoding for this enzyme is fatal in H. pylori and Mycobacterium smegmatis (Karita, Etterbeek, Forsyth, Tummuru, & Blaser, 1997;Pavelka & Jacobs, 1996), which confirms it is crucial for bacterial cell survival. As there is no similarity between the biosynthesis pathway of DapE in bacteria and in mammal systems, DapE is an attractive target which can be inhibited by inhibitors with antimicrobial activities (Scapin & Blanchard, 1998). Two histidine residues, namely His67 and His349, function in the active site of DapE from the H. influenzae bacteria. DapE, N-succinyl-L,L-diaminopimelic acid desuccinylase catalyzes the hydrolysis of N-succinyl-L,L-diaminopimelic acid (SDAP) to L,L-diaminopimelic acid and succinate (Born, Zheng, & Blanchard, 1998;Davis et al., 2006) ( Figure 1).
DapA forms 2,3-dihydrodipicolinic acid by catalyzing an aldol condensation reaction between L-aspartate-βsemi-aldehyde and pyruvate (Kefala et al., 2008). This enzyme is involved in lysine and cell wall biosynthesis of most bacteria. Synthesis of lysine is carried out from aspartate produced in the diaminopimelate pathway that serves as an intermediate for both lysine and cell wall constituent biosynthesis (Curien et al., 2008). The catalytic function of DapA is based on the ping-pong mechanism (Dobson et al., 2005;Hutton, Perugini, & Gerrard, 2007), wherein pyruvate binds to a lysine amino acid residue in the active site. This intermediate carries out a condensation with L-aspartate-β-semi-aldehyde to produce dihydrodipicolinate. Three active residues include two Tyr and one Thr amino acids, forming a proton motif to facilitate this catalytic condensation (Dobson, Valegard, & Gerrard, 2004).
The peptidoglycan layer consists of a branched polymer of β-(1,4)-linked N-acetyl or N-glycolyl-muramic acid and N-acetyl-glucosamine moieties. (LeMagueres et al., 2005;Vollmer, Blanot, & de Pedro, 2008) Short peptide chains, such as tetrapeptide L-alanyl-D-γ-glutamyl-meso diaminopimelyl-D-alanine or pentapeptide L-alanyl-D-γ-glutamyl-meso diaminopimelyl-D-analyl-Dalanine, extend from the N-acetylmuramic acid component and form cross-links by generating peptide linkages in trans peptide bonds (Ghuysen, 1968). D-alanine-D-alanyl dipeptide is produced by subsequent actions of two enzymes, Alr and Ddl. Alr, a pyridoxal 5 ʹ -phosphate (PLP)-containing enzyme, catalyzes the racemization of L-alanine into D-alanine, which serves as a key building block in the biosynthesis of the peptidoglycan layer. As Alr is generally absent in higher eukaryotes, it is an attractive target for new antimicrobial drug design (Lambert & Neuhaus, 1972;Silverman, 1988). Alr from Salmonella enterica serovar Typhimurium (Esaki & Walsh, 1986;Wasserman, Daub, Grisafi, Botstein, & Walsh, 1984), Thermus thermophiles (Seow et al., 2000), and Shigella sp. (Yokoigawa et al., 2001) have been reported as monomers, while the crystal structure of Alr from Bacillus stearothermophilus is a dimer with two identical polypeptide chains (Shaw, Petsko, & Ringe, 1997). The dimer has two active sites that consist of residues from both monomers. Alr protein sequences from different bacteria share highly conserved residues within the active center domains. Lys39 in the N terminal domain, and Tyr265 residues of the C terminal domain have been characterized as essential residues in alanine racemization (Watanabe , Kurokawa, Yoshimura, & Esaki, 1999. The Lys residue acts as a catalytic base during the racemization of L-alanine to D-alanine (Watanabe, Kurokawa et al., 1999), and the Tyr ring stabilizes hydrogen bond interactions in the active sites with the D-alanine substrate and catalyzes the conversion reaction of D-alanine to L-alanine (Watanabe, Yushimura, Mikami, & Esaki, 1999).
D-alanine-D-alanine ligase (Ddl) catalyzes an essential precursor (D-alanine-D-alanine dipeptide) in the biosynthesis of peptidoglycan. It plays a key role in the D-Ala pathway of peptidoglycan cross-linking (Neuhaus, 1962), and is a noteworthy target for designing novel antibiotics. Ddl catalyzes the formation of D-alanine-D-alanine dipeptide before its addition to the peptide chain of peptidoglycan, which is as a crucial component of the cell wall. As a result, considerable effort has been made to discover new inhibitors against Ddl (Sova et al., 2009). Ddl has three binding sites including: the ATP binding site is located in a cleft between the C terminal and central domains, and the two D-alanine binding sites are located between the C-terminal and N-terminal domains. Amino acid residues Lys141, Phe183, Lys185, Val195, Asn214, Glu221, Ala222, Ala223, Val224, Glu228, Asp302, and Glu315 serve as catalytic residues in the binding sites of the Ddl enzyme (Sova et al., 2009). The catalytic mechanism of Ddl induces conformational changes in the C and N terminal domains, and loops during substrate binding (Kitamura et al., 2009). E. coli Ddl exhibits higher bacterial activity than H. pylori (Clausen, Huber, Laber, Pohlenz, & Messerschmidt, 1996;Wu et al., 2008).
In this context, homology modeling of the considered targets using the templates from their homologues (with known biophysical and biochemical properties) is an ongoing area of research interest (da Cunha, Barbosa, Oliveira, & Ramalho, 2010). Comprehensive reviews have provided extensive information about the homology modeling and its application in the process of drug discovery (Cavasotto & Phatak, 2009;Hillisch, Pineda, & Hilgenfeld, 2004;Honarparvar, Govender, Maguire, Soliman, & Kruger, 2014;Johnson & Pinto, 2004;Reeck et al., 1987;Yuriev & Ramsland, 2013). The ongoing need for identifying new 3D structures of these unknown enzymes motivated us to perform homology modeling of the considered targets using templates from their homologues with known biophysical and biochemical properties.
The quality of the obtained homology modeled structures was assessed by calculating the root mean square deviation (RMSD) of the superimposed targets/templates, Ramachandran plots, ProQ, Qualitative Model Energy Analysis (QMEAN), and ERRAT scores. Furthermore, to correlate the similarity of α-helical contribution between target and template, the secondary structure of all targets and templates were investigated using PDBsum database. Molecular dynamics simulations of the targets/templates for two cases, namely MurI/2JFU and MurG/1F0K, were used to investigate the similarity of the dynamic behavior in solution, as well as the protein average distance between the mentioned targets and their corresponding templates. As a test case, the modeled target (MurI) was used to map the potential binding site residues using different binding site prediction algorithms. The docking calculations were performed to analyze the binding mode of interactions and affinities of the predicted active residues of the modeled MurI enzyme with the selected inhibitors.

Materials and methods
2.1. Sequence retrieval of target proteins M. tuberculosis strain CDC1551 (Manca et al., 1999) was used as the reference strain, and on which all protein sequences in this study were based [PMID: 12218036]. The FASTA (Lipman & Pearson, 1985;Pearson & Lipman, 1988) sequences of target proteins MurI, MurG, MraY, DapE, DapA, Alr and Ddl were collected from the National Center for Biotechnology Information (NCBI) protein sequence database (Geer et al., 2010). With these obtained sequences, the Basic Local Alignment Search Tool (BLAST) (Mount, 2007) was used to find sequence similarity between the considered targets and templates from the Protein Data Bank (PDB) database (Berman et al., 2000). The BLAST tool compares a protein query against a protein sequence databases, and calculates the importance of the matches statistically. Following the selection of a suitable template PDB structure from the database, BLAST sequence analysis was performed on the target sequence. Selection criteria (Mount, 2007) considered in our homology modeling were based on good correlation with the selected crystal structures of the templates, in terms of their highest sequence identity with the better resolution and E-value (Expected value), using BLAST for building the 3D model of the targets. The E-value is a statistic parameter that accounts for the significance of the target/template alignment. A lower E-value is referred to as the probability of random chance of alignment (https://salilab.org/modeller/tu torial/basic.html).

Homology modeling
The homology modeling of the 3D protein structures was carried out using Modeller 9.13 (Eswar, Eramian, Webb, Shen, & Sali, 2008;Webb & Sali, 2014). This software generates comparative protein structures utilizing satisfaction of spatial restraints. The spatial restraints are obtained from a statistical analysis of the relationship between similar proteins structures statistically, as well as by investigating a correlation between two covalent C α -C α distances or dihedral angles covalent main chains from two related proteins (Fiser, Do, & Šali, 2000;Rabiller & Bayer, 2004;Šali & Blundell, 1993). Five preliminary models were generated for each enzyme and ranked based on their discrete optimized potential energy (DOPE) scores, which indicates the distance dependent statistical potential based on probabilistic theory (Shen & Sali, 2006). Only the best-fit model is reported herein, with those with the lowest DOPE score being selected for structural assessment. The apo forms of the following templates and their corresponding PDB IDs were used for producing the 3D structure of our selected targets: 2JFU-chain A from Enterococcus faecium, 4J72-chain A from A. aeolicus, 1F0K-chain A from E. coli, 3TX8chain A from Corynebacterium glutamicum, 4FHA-chain A from Streptococcus pneumoniae, 3RFC-chain A from Xanthomonas oryzae and 2DY3-chain A from C. glutamicum.

Protein quality assessment
The Protein quality predictor (ProQ) is a neural network tool that is based on a number of structural characters that calculates the quality of a protein model to find suitable models and native structures (Wallner & Elofsson, 2003). ProQ measures the structural quality using the Levitt-Gerstein (LG) score (Gerstein & Levitt, 1998) based on the following criteria: LG score ˃1.5 indicate reasonable models, LG score ˃3 shows a good quality model, and LG score ˃5 is an indication of an extremely high quality 3D structure (Wallner & Elofsson, 2003). The QMEAN score (Benkert, Tosatto, & Schomburg, 2008) is a scoring function of homology models that assesses the major geometrical aspects of protein structures of produced models. Acceptable QMEAN scores are between 0 and 1, where a 0 score is significant and a 1 is very significant (Benkert et al., 2008). PYMOL software (DeLano, 2002) was used to superimpose the structures of targets with their corresponding templates to obtain the RMSD between Cα atoms of the aligned structures.
The stereo chemical quality of the predicted model and model accuracy were estimated using Ramachandran plots (Hooft, Sander, & Vriend, 1997) computed by PROCHECK, (Laskowski, MacArthur, Moss, & Thornton, 1993), which classified the residues according to different allowed and disallowed regions in the quadrangle. The Goodness factor (G-factor) derived by the Ramachandran plots is a measure of normal or unusual properties of the stereochemical protein structure. PROCHECK computes the torsion angles and covalent geometry. A low G-factor indicates that the property corresponds to a low probability conformation with acceptable values of the G-factor being between 0 and −.5, and the best models display values being close to zero (Engh & Huber, 1991). The ERRAT (Colovos & Yeates, 1993) database was used to verify the nonbonded atomic interactions in the protein structures. Different types of atoms are distributed non-randomly in proteins based on the energetic and geometric effects. ERRAT analysis identifies the regions of error in protein structure by investigating the statictics of pairwise atomic interactions.

Comparison of target-template secondary structure
The target and template secondary structures were retrieved from the PDBsum (Laskowski, 2001(Laskowski, , 2009Laskowski, Chistyakov, & Thornton, 2005;Laskowski et al., 1997) database to investigate the similarity of the contribution of α-helical units between targets and templates, as well as the role of the similarity in the α-helices in the RMSD of superimposed targets/templates.

Molecular dynamics simulations
Both 10 and 20 ns molecular dynamics simulations were performed at 300 K in explicit water. RMSD of the backbone, with respect to the initial conformation, and Rg of the protein model targets/templates (MurI/2JFU and MurG/1F0K) were derived from the MD trajectory using the AMBER12 (Case et al., 2005(Case et al., , 2012 program with the AMBER FF03 force field (Ashiuchi et al., 2003). The TIP3P (Jorgensen, Chandrasekhar, Madura, Impey, & Klein, 1983) cubic box was set up, and the molecules were positioned at the center of the water box, setting a 8 Å distance between the protein and the box edge with the SHAKE20 algorithm (Ryckaert, Ciccotti, & Berendsen, 1977). Energy minimizations were carried out with the SANDER program in 10,000 steps. All simulations were performed at NPT at 300 k, a pressure of 1 atm and 2 fs for the time step. The resulting trajectories were analyzed using the Ptraj module (Jorgensen et al., 1983) implemented in the AMBER 12 package.
2.6. The binding site identification Metapocket (Huang, 2009;Zhang, Li, Lin, Schroeder, & Huang, 2011) and POOL programs (Somarowthu & Ondrechen, 2012) were used to predict the most promising binding site information and active residues of the modeled MurI target. Metapocket as a robust method that includes four programs namely SURFNET (Laskowski, 1995), Q-SiteFinder (Laurie & Jackson, 2005), PASS (Brady & Stouten, 2000), and LIGSITE csc (Huang & Schroeder, 2006) to detect clefts and cavities on the protein surface. POOL combines computed electrostatics, evolutionary, and pocket geometric information to predict catalytic residues and relates functional binding sites to sequence (Somarowthu & Ondrechen, 2012). This program estimates protonation states of residues by identifying ionizable groups in the catalytic site using input data from THEMATICS (Ko et al., 2005;Tong et al., 2008;Wei, Ko, Murga, & Ondrechen, 2007). This software also considers various features, such as geometric cleft size and sequence conservation scores with THEMATICS, which enhances the performance of this software (Somarowthu, Yang, Hildebrand, & Ondrechen, 2001). POOL predictor was trained to utilize INTERPID (Sankararaman & Sjolander, 2008) and ConCavity (Capra, Laskowski, Thornton, Singh, & Funkhouser, 2009) to identify the functional residues and binding pocket information in protein, respectively.

Molecular docking
Docking studies for the modeled target MurI were performed using the AutoDock 4.2 (Sanner, 1999) program, with two pyridodiazepine amines derivatives (inhibitor I and II) (Geng et al., 2009;de Jonge et al., 2015) being considered for the docking studies. Ligand minimization was performed using the steepest decent method and MMFF94S force field with the Avogadro program (Hanwell et al., 2012). AutoDock tools (Morris et al., 2009;Sanner, 1999) 1.5.4 were employed to determine the proper size of the grid box for the potential binding site (determined with Mertapocket and Pool). The grid box was determined as center (X = 17.13 Y = 57.7 Z = −.427) and dimension (X = 52 Y = 60 Z = 58) with the grid spacing of .375 Å. The Lamarkian Genetic algorithm (Morris et al., 1998) was used for molecular docking analysis using AutoDock 4.2 (Sanner, 1999) program. The illustration of hydrogen bond, polar, and hydrophobic interactions between MurI protein and the ligands was presented with the LigPlot + v.1.4.5 (Wallace, Laskowski, & Thornton, 1995) program. Docking score and visual inspection of the inhibitor relative to the active site were used to determine acceptable host-guest interactions.

Sequence alignment and homology modeling between targets and templates
This section describes the results for the 3D structures of MurI, MurG, MraY, DapE, DapA, Alr, and Ddl enzymes from M. tuberculosis, as determined by the Modeller program. The crystal structure of 2JFU-chain A (Lundqvist et al., 2007) from E. faecium was found to be the best template for MurI, with a sequence identity of 41%, a sequence similarity of 58%, a resolution of 1.8 Å and an Expect value (E-value) of 9e−61 (Figure 2). Figures  2(a)-(c) present the superimposition of the MurI/2JFU (target/template) and the secondary structures of 2JFU and MurI, which are discussed in more detail in Section 3.3. The 3D structure of the modeled MurI is presented as a colorful structure in Supplementary Figure S1.
Similarly, the results of BLASTp sequence alignment of MraY identified the crystal structure of 4J72-chain A (Chung et al., 2014) from A. aeolicus as a suitable template (E-value of 6e−48; sequence identity 35%; sequence similarity 56%; resolution of 3.3 Å). The  Figure S1 crystal structure of 1F0K-chain A (Brunger et al., 2012) from E. coli was chosen as the template for MurG (E-value of 8e−53, sequence identity 38%; sequence similarity 52%; resolution of 1.2 Å). The superimposition of the considered model targets/templates (MraY/4J72, MurG/1F0K) is demonstrated in Figure 3(a) and (b). The crystal structure of 3TX8-chain A (Ha et al., 2000) from C. glutamicum was found to be the most suitable template for DapE (E-value of 5e−125; sequence identity 55%; sequence similarity 69%; resolution of 2.97 Å), which is presented as the superimposition model of target/template in Supplementary Figure S3(c). The crystal structure of 4FHA-chain A (Berman et al., 2000) from S. pneumoniae was characterized as the template for DapA (E-value of 5e−55; sequence identity 38%; sequence similarity 55%; resolution 1.88 Å). For further evidence, homology modeling was performed between DapA from strain CDC1551 and chain A 1XXX from H37Rv, which exists as a dimer (Kefala et al., 2008). The target/template overlays of the two cases (DapA/ 4FHA and DapA/1XXX) are shown in Supplementary Figure S3(d). Superimposition of the target/template DapA/4FHA gave a RMSD of .21 Å (Table 1), while target/template DapA/1XXX gave a RMSD of 1.4 Å, which is much higher (i.e. less feasible). It is remarkable that the existing homology modeling code/algorithms (Modeler 9.13) do not consider 1XXX as the best template to create the 3D structure for the same enzyme (DapA) of CDC1551 (despite a more than 99% sequence identity according to the BLAST server). This superimposition is provided in Supplementary Figure S3(d).
The crystal structure of 3RFC-chain A (Berman et al., 2000) from X. oryzae for Ddl (E-value of 8e−89; sequence identity 44%; sequence similarity 58%; resolution of 2.1 Å), and the crystal structure of 2DY3-chain A (Berman et al., 2000) from C. glutamicum were selected as templates for Alr (E-value 8e−84; sequence identity 42%; sequence similarity 63%; resolution of 2.1 Å). The superimposition of the structures for the Ddl/3RFC and Alr/2DY3) is presented in Supplementary Figures S3(e)-(f). All the produced 3D structure models of our considered targets are provided in Supplementary Figures S1, S3(c)-(g).
The superimposition of the considered model targets/ templates, (MraY/4J72, MurG/1F0K) is demonstrated in Figure 3(a) and (b). The 3D structures of the modeled MurY and MurG are presented in Supplementary  Figure S3(g) as colored structures.

Model quality assessment
The RMSD between the Cα atoms of the modeled target/ template were MurI/2JFU .2 Å (Figure 2 Table 1. These low RMSD values indicate the reasonable structural similarity between the targets and their corresponding templates, and confirm the validity and success of our homology modeled structures. The calculated parameters related to the model quality assessment for all the modeled proteins are shown in Table 1. According to the reported data in Table 1, the considered targets can be classified into three groups, based on the model quality (LG score). Extremely high quality models had LG scores ˃5 in the modeled targets MurI, DapE, and Ddl, good models had LG score ˃3 in the modeled MraY, DapA, and Alr and reasonable models had LG scores ˃1.5 in the MurG target. In addition, the QMEAN scores for the all modeled targets were between 0 and 1, which are significant and within acceptable range (Benkert et al., 2008). The modeled MurI target (with QMEAN of .79 and LG score of 6.18) is of high quality and that of MurG (with a LG score of 1.98 and a QMEAN score of .63) is of acceptable quality.
An alternative structural PROCHECK analysis was performed using Ramachandran plots. The results indicated that the residues of the modeled proteins in the most favored Ramachandran plot regions were DapE (90.5%), MraY (91.8%), MurG (91.1%), MurI (94.7%), Ddl (90.6%), DapA (89.7), and Alr (84.1%), and in the additional allowed regions, were 8.2, 6.2, 5.9, 4.9, 8, 7.5, and 12.9%, respectively (Table 2). These results revealed that the stereochemical quality of the predicted models were found to be good (Wallner & Elofsson, 2003), with a high percentage of residues having phi/psi angles in the allowed regions. The Ramachandran plots of all considered targets were generated using PRO-CHECK (Laskowski et al., 1993) (Figure 4). The Ramachandran plots of the other model target/template proteins are provided in Supplementary Figure S4.
The acceptable overall average values of the G-factor in PROCHECK are between 0 and −.5 (Engh & Huber, 1991). The total quality G-factor for all the cases confirmed the high structural quality of the considered models. The parameters relating to the main-chain and side-chain of the homology modeled structures did not reveal any bad contacts or unsual scores (Table 2).
ERRAT plots (Laskowski, Watson, & Thornton, 2005), which refer to the pairwise atomic interactions, is a so-called 'overall quality factor' for non-bonded atomic interaction, with higher scores indicating a reasonably high quality. In the ERRAT plots, black bars identify the miss-folded regions located distantly from the active site, gray bars demonstrate the error region between the confidence limit 95-99%, and white bars indicate the region with the least errors caused by the folded regions. ERRAT scores of over 50 are acceptable for high quality models . In this study, the ERRAT score for the MurI modeled protein was 92.37 and lies in an acceptable normal range ( Figure 5 and Table 2), and implies that the backbone conformation and non-bonded interaction of MurI model were acceptable. The observed ERRAT scores for the other homology modeled targets were between 76 and 94% (Table 2), with the higher score indicating a higher quality.
The dynamic behavior of the target/template for MurI as a model with high quality, and the MurG as a lower quality model, were further investigated using molecular dynamics studies (Section 3.4). If these two model enzymes demonstrate reasonably close dynamic behavior in terms of similarity in protein stability and structural compactness with their corresponding templates, the  *Lower than −.5: unusual; lower than −1.0: highly unusual (Engh & Huber, 1991). expectation is that the homology models were indeed of good quality. It is therefore likely that the rest of the modeled enzymes will also be of good quality, as the various quality measures indicated (LG score and Ramachandran regions). The ERRAT score for the MurI modeled protein is presented in Figure 5 and Table 2.
The obtained Ramachandran plot and ERRAT results of the remaining modeled targets are provided in Supplementary Figure S5(A)-(F).

Protein secondary structure analysis
The secondary structure features of targets and templates in this study were taken into consideration and are presented in Supplementary Table S3. This table indicates the contribution of residues in helical and β-strand forms in terms of the obtained secondary structures. α-Helices are the most common secondary conformations in natural proteins (Ruan, Chen, & Hopkins, 1990), and are the bioactive regions of numerous proteins, playing critical roles in many protein-protein, protein-DNA, and protein-RNA interactions (Fairlie, West, & Wong, 1998). Comparable values of α-helices between target and template lead to a similar topology of target to its template and affect superimposition, as well as the RMSD values of the aligned targets/templates (Chaitanya, Bhaskar Reddy, Minakshi, Zaveri, & Kaladhar, 2013;Gundampati et al., 2012). The observed lowest and highest values of RMSD for the superimposed targets/templates, namely MurI/2JFU and Ddl/3RFC, were .20 and .44 Å, respectively. There were differences in the amino acids  contribution of α-helices between each target and it's corresponding template, which were not significant (Supplementary Table S3). The smaller difference of the α-helices contribution between each target/template showed better RMSD values. In other words, a comparison of the highest and lowest RMSD (of superimposed targets/templates), and the similarity in the contribution of α-helical units between targets and templates, revealed a meaningful correlation between the similarity of α-helical units for targets/templates and their RMSD, which is related to the aligned targets/templates (RMSD). It is therefore likely that the mode of action of the modeled structure should be similar (if not identical) to that of the template enzyme from different bacterial homologous.
The comparison of the secondary structures between each of the pairs (targets/templates) is demonstrated in Supplementary Table S3. This table indicated that in the case of MurI, 120 residues (44.3%) were in helical form, 44 (16.2%) were strands, and 107 residues (39.5%) were in other conformations (Figure 2(c)). In a similar way, the PDBsum secondary structure of its template-2JFU showed that 117 residues (43%) were helices, 45 (16.5%) were strands, and 110 residues (40.5%) were in other conformations (Figure 2(b)). Identification of each α-helices, strands, and the other conformations between target and template are presented in Supplementary Table S3 and Supplementary Figures S2(d)-(o). The same comparison was made between the rest of modeled targets/templates. The conservation of the secondary structure revealed the reliability of the proposed model that was obtained using Modeller, being based on the target-template alignment.

Molecular dynamics simulations
Based on literature (Elengoe, Naser, & Hamdan, 2014;Rohini & Srikumar, 2013;Satpathy, Behera, & Guru, 2012), the RMSD and Rg of model targets/templates (MurI/2JFU and MurG/1F0K) over 10 and 20 ns MD trajectory were calculated. The obtained results were compared to achieve a deeper insight into the extent of similarity in protein stability and protein average distance between targets/templates. For the 20 ns, the MD run RMSD and Rg analysis were performed. The modeled targets, MurI with the highest LG score and most favored region of the Ramachandran plot, and the MurG model with lowest LG score and highest disallowed region in Ramachandran plot, were selected for this purpose. The calculated RMSD over the 20 ns MD trajectory at 300 K for the modeled MurI/2JFU (target/template) were within the range of .5-2.5 Å, and showed the reasonable closeness between the target and template's structural stability. This also reconfirmed the performance of the homology modeled structures (Figures 6 and 7). The same trend was observed for the MurG/1F0K target/template; the RMSD values of the modeled MurG and 1F0K template were between 1 and 4.5 Å, which confirms the close structural stability between the MurG/1F0K and the target/template (Supplementary Figures S6 and S7). The Rg, between the modeled protein MurI and its template, 2JFU, fluctuated around 1 Å, which indicated a close variation of the protein average distance and protein structure compactness between the MurI target and the 2JFU template. Similarly, the measured Rg of the target/ template MurG/1F0K, showed variations of less than 1 Å over 10 and 20 ns MD runs, which confirmed the close proximity of the protein average distance between the MurG modeled target protein and the 1F0K template. It is clear from the 10 ns MD-based RMSD and Rg calculations (provided in supplementary information) that  the shorter simulation times are sufficient to compare the stability and compactness for these systems (Supplementary Figures S6-1, S6-2, S7-1 and S7-2). The RMSD plots of the modeled MurI (target) and 2JFU (template), as well as the other target/template (MurG/1F0K) over the 20 ns MD trajectory, are presented in Figure 6 and Supplementary Figure S6, respectively.
The plot of Rg of the protein model targets/templates (MurI/2JFU and MurG/1F0K), which implies protein average distance, is presented in Figure 7 and Supplementary Figure S7, respectively.

Identifying the binding sites
The potential binding sites of the considered MurI target were characterized using Metapocket (Huang, 2009) and POOL (Somarowthu & Ondrechen, 2012), and are presented in this section. Five possible binding sites were displayed and listed in Figure 8(a) and (b) using Metapocket (Chetty & Soliman, 2015). The top rating 51 active amino acids as binding site residues, as derived by the POOL program (Chetty & Soliman, 2015), are marked in yellow in Figure 9(a) and are listed in Figure 9(b). An analysis of the results from the two programs indicated that 40 amino acids were common in both Metapocket and POOL programs, which is shown in a cartoon representation, Supplementary Figure S8. Among the five potential binding sites predicted by Metapocket, site 1 was found to be the most plausible site (based on the docking results). The potential binding sites obtained by the Metapocket program for the remaining modeled enzymes are provided in Supplementary Figures S8-1 to S8-6. A comparison of the results from the Metapocket (Huang, 2009) and POOL (Somarowthu & Ondrechen, 2012) program enabled us to affirm the vicinity of the potential binding site residues. Due to a lack of detailed structural information about the binding sites for the template of the MurI target (PDB code: 2JFU) from E. faecium, the binding sites of MurI from H. pylori was studied to compare our gained potential binding sites with experimental studies (Lundqvist et al., 2007). Interestingly, the experimental results were in good agreement with the identified binding sites using both Metapocket and POOL programs, and gave confidence to further validating our predicted potential binding sites. An experimental study on H. pylori (MurI) showed that active site residues of this bacteria involves Phe13, Lys17, Val10, Glu150, Ser152, Cys181, His183, Leu186, and Trp252 (Lundqvist et al., 2007).
According to the available experimental evidence (Lundqvist et al., 2007), and close similarity of the identified binding sites between site 1 using the Metapocket program and 51 top-ranked binding residues, site 1 was considered the most suitable potential binding site (Figures 8 and 9). Most of the top-ranked residues, including 40 common amino acid residues corresponding to binding site 1, were predicted using the POOL software (Figures 9(a), (b), and S8). The potential binding sites obtained by the Metapocket program for the remaining modeled enzymes are provided in Supplementary Figures S8-1 to S8-6. 3.6. Molecular docking analysis Pyridodiazepine amines derivatives are effective inhibitors against the MurI enzyme and inhibit the enzyme by suppressing its growth (Geng et al., 2009;de Jonge et al., 2015). To explore the binding mode of the two selected ligands (Figure 10) with the active residues of the MurI protein, a comprehensive molecular docking study was performed to identify the most plausible predicted binding pocket of the MurI target, which was identified in the previous section (Ramalho et al., 2011).
The binding mode of the top-ranked-docked inhibitor I and II conformers with possible hydrogen bonding and hydrophobic interactions with MurI obtained using Lig-Plot + are displayed in Figure 11.
The amino acid residues involved in the binding sites of the MurI target and its electrostatic interactions with inhibitor I and inhibitor II ligands are listed in Table 3.
The mode of interaction for the two docked benzodiazepine amine ligands (inhibitor I and inhibitor II) with active residues of MurI is displayed in Figure 11. Among the several docked conformers that were obtained, the most probable modes of interaction were selected (visual inspection) in close proximity of the predicted active residues. These were found to be in good agreement with experimental structural binding site evidence (Lundqvist et al., 2007). The Val15, Gly16, Thr19, Val149, Val152, Glu153, Cys185, Thr186, His187, Thr188, Leu190, Leu250, Leu250, and Phe254 amino acid residues are involved in the docked conformers (Table 3 and Figure 11). According to the mode of interaction displayed for the docked structures (Figure 11), inhibitor I formed a hydrogen bond (H-bond) with His187 in the active site. The lipophilic cyclopropyl group of this inhibitor was in contact with amino acid residue Val149. Gly16, Gly17, and Leu190 were observed to have hydrophobic interactions with lipophilic parts of inhibitor I. For inhibitor II, the sulfur atom formed two hydrogen bond interactions with Cys185 and Figure 10. Chemical structures of the selected benzodiazepine amine derivatives are presented. Figure 11. Schematic diagrams of hydrophobic and hydrogen bond interactions for the docked benzodiazepine amine derivatives (inhibitors I and II) inside active pocket of the modeled MurI mapped using LigPlot + software.
Thr186. The aromatic rings of the inhibitor displayed hydrophobic interactions with amino acid residues Phe254, Leu250, and Val15. The negative and low value of the binding energy, as well as the most potential hydrogen bonding and hydrophobic interactions, showed strong and most favorable binding interactions between the protein and ligand molecules. Agreement between the experimental evidence of active site residues, the predicted binding site amino acid residues and the docking results served to confirm the position of the most feasible binding site of the modeled MurI enzyme.

Conclusion
In this study, the homology-based 3D models of the selected enzymes involved in the biosynthesis of the peptidoglycan cell wall in M. tuberculosis were constructed using Modeller 9.13. Acceptable structural LG, QMEAN, ERRAT scores, and satisfactory percentages of residues with phi/psi angles in the allowed regions of the Ramachandran plots demonstrated the robustness of the predicted homology modeled structures. It was also found that a contribution of α-helical units between targets and templates display an essential role in the structural similarity and consequently superimposition of homology modeled targets with their corresponding templates. The comparison of MD-based RMSD, and Rg values of the MurI and MurG targets and their corresponding templates over the 20 ns MD simulations confirmed the close proximity of the protein stability and protein average distances. It is important to note that no essential difference between the 10 ns MD and the 20 ns results were observed, suggesting that a 10 ns MD analysis is sufficient for similar size proteins. The identified potential binding pocket, using two binding site predic-tors, showed satisfactory similarity with the catalytic site reported experimentally for MurI from H. pylori and confirms the locality of the possible binding site of MurI target. The best docked conformers were selected based on the lowest binding energy and highest possible hydrogen bond and hydrophobic interactions of ligand molecules inside the active pocket of MurI target.
These additional theoretical results are helpful to determine structural information related to the obtained homology modeled structures as potential promising therapeutic targets to combat tuberculosis. It provides an efficient blueprint to identify the structural basis of potential binding pocket of these specific unknown enzymes.

Supplementary material
The supplementary material for this paper is available online at http://dx.doi.org/10.1080/07391102.2015.1117397.

Disclosure statement
No potential conflict of interest was reported by the authors. Table 3. The close contact residues that interact with the binding site of the modeled MurI are presented. (All these residues were included in site 1 in Figure 7(b).)