Homology modeling and virtual screening of ubiquitin conjugation enzyme E2A for designing a novel selective antagonist against cancer.

Abstract Cancer is a major health problem in the world. The initiation and progression of cancer is due to imbalance between the programmed cell growth and death. These processes are triggered by the ubiquitin family enzymes. The ubiquitin-like proteins are responsible for the cell metabolism. Ubiquitin-dependent proteolysis by the 26s proteasome plays a crucial role in cell cycle progression as well as in tumorigenesis. In the ubiquitin proteasomal degradation pathway, ubiquitin conjugation enzyme E2A (UBE2A) binds with ubiquitin ligase RAD18, results in polyubiquitation reaction and cell cycle progression. UBE2A is an important contributing factor for the control of tumorigenesis. In the present work, the 3D model of the protein UBE2A was generated by homology modeling technique. The generated 3D structure of the UBE2A was validated, and active site was identified using standard computational protocols. The active site was subjected to structure-based virtual screening using small molecule data banks, and new molecules were identified. The ADME properties of the new ligand molecules were predicted, and the new ligands are identified as potent UBE2A antagonists for cancer therapy.


Introduction
Cancer control, since decades, remains a global problem, despite advances made in medicinal chemistry research (1). Cancer involves uncontrolled cell proliferation, invades nearby tissues and spreads through the blood stream, lymphatic system to other parts of the body (2). Cancer is a leading death cause worldwide and accounts for 13 million victims per year, and is expected to rise to 22 million annually within the next two decades (3)(4)(5). The ubiquitin proteins act as signaling messengers and control the cell cycle progression, including the post-DNA transcriptional modification, apoptosis and cell proliferation (6)(7)(8)(9)(10). The ubiquitin-mediated proteasomal degradation pathway is one of the novel drug targets of cancer therapy (11,12). The ubiquitin-mediated proteasome degradation pathway mainly involves three enzymatic reactions ( Figure 1). Initially, the ubiquitin binds to ubiquitin-activating enzyme (E1) in the presence of ATP to form a thio-ester bond, and the activated ubiquitin is transferred to the ubiquitin conjugation enzyme (UBE2A). This reaction is followed by the covalent interaction of ubiquitin to the "-amino group of one or more lysine residues of the target proteins which are in conjugation with ubiquitin protein ligase (RAD18). The process of ubiquitination is repeated by the attachment of another ubiquitin to form a polyubiquitin chain, which is essential for an effective recognition by the 26s proteasome leading to cell cycle progression (13,14). The negative regulation of ubiquitinmediated proteasomal pathways leads to the development of different human diseases (15). In this pathway, ubiquitin ligase proteins act as carrier proteins, which control the cell cycle progression and tumor formation (16). In eukaryotes, the cell cycle progression is mainly dependent on the cyclindependent kinases (CDKs), phosphorylation and ubiquitin proteasomal degradation pathway (17,18). The negative regulation of the CDKs effects the ubiquitin proteasomal degradation process and forms tumors (19). The UBE2A protein is a RAD6 yeast homolog HR6A protein, which interacts with the ubiquitin ligase RAD18, involved in polyubiquitination reaction and cell cycle progression (20)(21)(22). Nowadays, different types of proteasomal inhibitors are available (23), nevertheless to inhibit the initial stage of cancer, the interaction between the UBE2A protein with its natural substrate ubiquitin ligase RAD18 is a novel drug designing target.

Materials and methods
The protein structure availability is essential for understanding the biological processes and functions at molecular level.
Protein structures can be determined at high resolution by either experimental methods such as X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy or by computational methodology (24).

Homology modeling
The X-ray crystal and NMR structures are not reported for ubiquitin conjugation enzyme E2A (UBE2A). The 3D structure of UBE2A was generated using comparative modeling   techniques. The FASTA sequence of UBE2A protein sequences with accession code (P49459) was retrieved from the ExPASy proteomic server (UniProt) (25). The sequence was subjected to the Basic Alignment Tool (BLASTp) program (26,27), the PHYRE (28) and the JPred3 servers (29) by setting each server to search for the protein template. The template was selected based on the criteria of sequence identity and a statistical measure of e-value obtained from the sequences. Pairwise alignment of protein UBE2A with selected template protein (PDB ID: 1JAS) was performed using ClustalW (30) server to define the structurally conserved regions. The Gonnet matrix algorithm is used in ClustalW server, for the automated pairwise alignment to ensure the presence of conserved motifs and to minimize the atomic gaps (31). The 3D model of UBE2A protein was generated using homology modeling software MODELLER 9v7 (MODELLER 9v7, San Francisco, CA). Thirty initial models were generated, and the model with the least modeler objective function value was selected for structure refinement in the structurally variable regions (32,33). The RMSD value for the UBE2A protein and its template were calculated in the Swiss-Pdb Viewer (SPDVB_4.1.0) (34).

Protein preparation and model validation
The energy minimization for the 3D model of UBE2A protein was carried out using protein preparation wizard in Maestro 9.1 (Maestro v 9.1 Schrodinger, LLC, New York, NY). The energy minimization of UBE2A protein was performed using an all-atom Impact Refinement module (Impref) (Impact v 5.0, Schrodinger LLC, New York, NY), to adjust steric clashes, bond orders were assigned, water molecules were removed, hydrogen bonds were added and the default root mean square deviation (RMSD) value of 0.30 Å was assigned using OPLS_2005 (Optimized potentials for liquid simulations) force field (35)(36)(37). The refined and energy minimized 3D model of UBE2A protein was evaluated using validation programs PROCHECK (38,39), ERRAT (40), VERIFY_3D (41), available from the Structural Analysis and Verification Server (SAVES) and ProSA server (42). In PROCHECK server, Ramachandran plot elucidates the steriochemical quality of the resulting protein structure. The ERRAT program describes the overall quality factor of UBE2A protein, and the acceptable range of ERRAT program is above 50%. The Verify_3D gives the 3D structure quality of the The 3D structure of the UBE2A protein shows four alpha helices and four beta sheets, obtained from the MODELLER 9v7 program. Figure 3. Sequence alignments of UBE2A with the template protein 1JAS chain A. Pairwise sequence alignment carried out with Clustal W server: the conserved residues are represented as (*), and highly similar residues as (:), similar residues as (.). Alignment of the UBE2A with the template protein gave 95% sequence similarity.
protein. The ProSA server calculates the native fold, pairwise energy and Z-score of the modeled 3D structure of UBE2A.

Active site prediction
Active site is a ''specific binding site'' in the protein, which is involved in the catalytic activity. The active site region of the protein UBE2A was identified using CASTp (Computed Atlas of Surface Topography of Proteins) server (43) and Q-site server (44). The protein-protein docking was carried out using ZDOCK 3.0.1 software (45), to identify the specific binding residues within the binding domain of UBE2A protein. Based on the active site and important binding residue predictions, the active site grid was generated using the receptor grid generation module of Glide module (Glide, version 5.6, Schrödinger, LLC, New York, NY, 2010).

Virtual screening
Virtual screening was carried out with a set of small molecules at the active site region of UBE2A. LifeChem and CB_NovaCare small molecule structural database ligands were prepared to perform the virtual screening work flow in the Schrödinger suite. The ligand 3D coordinates were generated using LigPrep (ligand preparation) module in the Schrödinger suite using OPLS_2005 force field. The process of ligand preparation involves conserving the specified chiralities to generate minimum five stereoisomers per ligand, using default conditions at pH 7.0 ± 2.0. Several conformers were obtained after the completion of ligand preparation. The virtual screening program of protein UBE2A using LifeChem and CB_Nova Care structural databases with active site grid was performed by three scoring protocols of docking with Glide; the High Through Put Virtual Screening (HTVS), standard precision (SP) and extra precision (XP) (46). The ligand molecules were identified and prioritized based on the glide score and glide energy. The ADME properties of ligand molecules were calculated using QikProp module (47)

Results and discussion
Structural analysis of UBE2A protein The experimental structure of UBE2A protein is not available in the Protein Data Bank, to study the structural properties. Comparative modeling techniques were applied to generate the 3D structure of UBE2A protein. The UBE2A protein is a carrier protein belonging to the ubiquitin conjugating enzyme family and has 152 amino-acid residues. The protein conserved domains and E3 binding (ubiquitin ligase) interactions were identified using BLAST server (26,27) as shown in Figure 2. The protein template search was carried out by BLAST, JPred3 and PHYRE 0.2 servers, the results of which are shown with their corresponding e-values in Table 1. A low e-value indicates a high-protein specificity and sequence identity (26,27). The UBE2A protein aligned with the selected template protein (PDB code: 1JAS) show a 95% of sequence similarity, (Figure 3) using ClustalW server. Thirty models were generated initially using MODELLER 9V7, and the modeler objective function value was considered for structure refinement (49). The RMSD value of UBE2A protein with its template protein UBE2B is 0.23 Å , which is calculated using the Swiss-Pdb Viewer (SPDBV_4.1.0, the permissible range of RMSD for any protein being 2 Å ). The energy minimization of the protein was carried out using protein preparation wizard in Schrödinger suite. The total energy of the protein after optimization was 2.809 Â 10 2 kcal/ mol, and after protein minimization the value was À2.642 Â 10 2 kcal/mol. The 3D model of the UBE2A protein evaluated is shown in Figure 4 and is used for further studies. The UBE2A protein has 4a-helices and 4b-strands ( Table 2). The stereochemical analysis of the protein UBE2A was carried out using the Ramachandran plot, which is represented as a contour diagram of the backbone dihedral angles [Phi () versus Psi (É)] as shown in Figure 5. The plot statistics shows 95.4% of amino acid residues (124 AA) in most favored regions, 4.6% of amino acid (6 AA) in additionally allowed regions, and none in the generously allowed region and disallowed regions (more than 90% of amino residues are in the most favored region; therefore the generated 3D model is a good quality model). The UBE2A protein structural compatibility as explained by the Verify_3D plot ( Figure 6) gives a value of 100%, infers this model as a good structure. The overall quality factor value of protein UBE2A is 73.61% calculated using ERRAT program. The structure does not show any bad contacts, and the Figure 7. The overall quality value of UBE2A protein using ERRAT program. The UBE2A protein has overall quality factor of 73.61% explained by ERRAT program. On the axis, two lines are drawn to indicate the confidence with which it is possible to reject regions that exceed the error value. Expressed as the percentage of the protein for which the calculated error value falls below the 95% rejection limit. Good high-resolution structures generally produce values around 95% or higher. For lower resolutions (2.5-3.5), the average overall quality factor is found to be 91%. model quality details are shown in Figure 7. ProSA energy plot explains the local quality of the modeled structure of the protein. Amino-acid residues with negative ProSA energies are more reliable, and most of the amino-acid residues in the UBE2A protein have negative energies as shown in Figure 8(a). The overall Z-score of UBE2A protein is À7.3 as shown by the Dark spot in Figure 8(b). It falls in the range of the Z-score of the existing protein structures deposited in PDB, determined either by NMR or X-ray techniques. The evaluated 3D structure of the UBE2A protein reveals a good quality model.  Amino acid residues are represented in three letter code with its position in protein. The three binding cavities were identified in CASTp and Q-site servers based on the hydrophobicity and the ligand binding site prediction, respectively.    The active site is unique in defining the activity of a protein in biochemistry. The active site residues were identified using CASTp, Q-site servers and protein-protein docking studies. The UBE2A protein binding sites and corresponding binding pockets' volumes were identified based on the hydrophobicity of residues using CASTp server. Q-site server explains the interaction energy between the protein and a simple van der Waals probe to locate energetically favorable binding sites of the UBE2A protein. Similar protocols were reported for identification of the active site (50)(51)(52). The ligand binding site prediction for UBE2A from the CASTp and Q-site tools is shown in Table 3.
The protein-protein docking of UBE2A with its natural ligand, namely the ubiquitin ligase RAD18 protein was carried out using ZDOCK 3.0.1 software to identify the binding residues (Figure 9). The inter molecular interactions of UBE2A protein docked complex with the ubiquitin ligase RAD18 are shown in Table 4, visualized in Accelrys Discovery Visualizer (Accelrys Software Inc., 2012 Accelrys Discovery Studio Visualiser v 3.5 Accelrys Software Inc., San Diego, CA). The intermolecular interactions gave the information regarding binding residues of UBE2A protein.
In the docked complex of the UBE2A protein with the RAD18, the residues S-25, E-44, G-45 and D-50 are significant for ubiquitination process and its activity. The UBE2A protein and template belong to the same phylogenic family. All the members of ubiquitin-conjugating enzyme family contain conserved motifs in similar binding domains (Figure 2), including UBE2A and its template 1JAS proteins. The active site prediction results of UBE2A protein obtained from CASTp, Q-site servers and protein-protein docking studies were corroborated and compared with that of the template active site region (Supplementary Figure 1). The UBE2A protein active site domain is predicted to be L16 to G26; V39 to F53; F72 to F77 and S148-D151. Keeping these results in view, the active site of UBE2A for the next step of virtual screening process was considered.
Virtual screening and ADME properties of UBE2A protein A grid was generated considering the centroid of the residues (as discussed in the previous section) in the active site region with 80 Å Â 80 Å Â 80 Å dimensions for further virtual screening studies. The structure-based virtual screening was performed with a library of molecules prepared from the ligands of LifeChem and CB_Nova Care structural data banks. LifeChem databank containing 12 000 molecules was subjected to ligand preparation. The stereoisomers with specific chiralities were retained to generate five conformers per ligand. The stable conformers were retained and output of 20 658 stereoisomers generated. The generated conformers were subjected to the HTVS mode with output of 6790 structures. The output structures of 6790 were further screened successively in SP and XP modes with a filtration ratio of 10% at each stage of docking. This process gives an output of 67 docked structures. A similar protocol followed for LifeChem database was applied to the virtual screening of the CB_Nova Care. The CB_Nova Care molecular structural database contains 10 000 molecules and gave an output of 36 935 conformers after ligand preparation and further virtual screening through HTVS, SP and XP modes gave an output of 92 docked structures. These structures can be considered for the identification of potent ligands as antagonists for UBE2A protein. A total of 159 docked complexes obtained show a Glide score in the range À8.5 to À6.4. A sample of 12 best docked ligand molecules are shown in Table 5 based on the XP glide docking score and glide energy. The hydrogen bond, cation interactions of protein UBE2A-ligand complexes are shown in Figure 10. The protein and ligand interactions were visualized using Accelry Discovery Visualizer (Accelrys Software Inc., 2012 Accelrys Discovery Studio Visualiser v 3.5. Accelrys Software Inc., San Diego, CA) and PyMOL 1.3. The SASA describes the potential binding region in the protein structure (53). The SASA values of the protein UBE2A and ligand molecules were analyzed before and after docking (Figure 11), setting the probe radius value to 1.40 Å and using 240 grid points per atom with the help of Accelrys Discovery Studio Visualizer (Accelrys Software Inc., 2012 Accelrys Discovery Studio Visualiser v 3.5. Accelrys Software Inc., San Diego, CA). The resulting ligand molecules were subjected to QikProp module of Schrodinger suite to calculate the ADME properties based on Lipinski's rule of five, Jorgensen rule of three and different parameters as shown in the Table 6. The ligand molecules inhibit the UBE2A protein and ubiquitin ligase RAD18 interactions to control the cancer.

Conclusion
The protein UBE2A plays a significant role in ubiquitinmediated proteasomal degradation pathway. The comparative     The permissible ranges are as follows: modeling studies of UBE2A protein gave insight into the structural details of protein and the active site. Virtual screening with a library of molecules against UBE2A protein shows the amino acid residues Ser25, Phe41, Asp50, Thr52, Ser 74, Ser148 and Asp 151 to be important binding residues. The present study helps in the identification of potent molecules for cancer therapy.