Structural analysis of alternate sigma factor ComX with RpoC, RpoB and its cognate CIN promoter reveals a distinctive promoter melting mechanism

Abstract Alternate sigma factors play a major role in the survival of pathogenic bacteria such as Streptococcus pyogenes under adverse environmental conditions. Stress-induced sigma factors mediate gene expression under conditions of pathogenesis, dormancy and unusual environmental cues. In the present work, ComX, an alternate sigma factor from S. pyogenes, has been characterized. The structures of ComX and of the RpoB (β) and RpoC (β′) subunits of RNA polymerase have been predicted using comparative and homology modelling, respectively, and validated. The interactions of the RpoB-RpoC-ComX complex with Double Strand (DS) and Single Strand (SS) promoter regions have been studied. The stability of these complexes and the promoter melting mechanism have been analysed using Molecular Dynamics (MD) simulations. This study suggests that ComX, although it recognises its promoter in a manner analogous to the alternate sigma factor SigH of M. tuberculosis, follows a distinctive promoter flip-out mechanism. Communicated by Ramaswamy H. Sarma

The transcription process in bacterial systems is initiated by the ~400 kDa core RNA polymerase together with a dissociable sigma factor for promoter recognition, forming a holoenzyme complex (Goldman et al., 2009; McClure et al., 1978; Murakami et al., 2002). Among the major subunits of the RNA polymerase complex, the sigma factor provides a potent response to cellular signals and mediates synthesis of proteins essential for bacterial survival (Borukhov & Nudler, 2003; Kazmierczak et al., 2005). Sigma factors confer promoter recognition on RNA polymerase, play a key role in DNA strand separation and then dissociate from the core enzyme after transcription initiation. Generally, prokaryotic sigma factors are classified into two main, evidently unrelated categories, namely sigma 70 and alternate/secondary sigma factors (Helmann & Chamberlin, 1988). In bacteria, the primary sigma factors transcribe the house-keeping genes, while the alternate sigma factors come into action to regulate specific sets of genes that allow the cell to adapt to harsh and varied environmental conditions (Lonetto et al., 1992).
One such alternate sigma factor is ComX of S. pyogenes, which identifies the CIN box (TACGAATA) promoter sequence and specifically transcribes ComX-dependent genes in the competence pathway of the bacterium. ComX facilitates horizontal gene transfer in S. pyogenes by identifying the CIN box promoter so that DNA uptake and recombination can take place (Mashburn-Warren et al., 2012; Opdyke et al., 2001). Hence, in order to understand the mode of interaction between the alternate sigma factor, the core RNA polymerase subunits and the CIN box promoter, three dimensional models of these proteins and the promoter sequence were generated and an in-depth structural analysis was performed. Although all the subunits of core RNA polymerase are significant in transcription, our study has mainly focused on the beta and beta prime subunits, which are involved in direct DNA binding (Nomura et al., 1984; Yura & Ishihama, 1979).

Sequence analysis of RpoC, RpoB and ComX
The amino acid sequences of the RNA polymerase beta prime and beta subunits (RpoC, accession no: NP_268496.1; RpoB, accession no: WP_010921796.1) and the alternate sigma factor (ComX, accession no: AYO93378.1) of S. pyogenes were retrieved from the NCBI database. A sequence similarity search for RpoC, RpoB and ComX was performed using PHI BLAST against the non-redundant sequence database (Altschul et al., 1990). The top five similar sequences were aligned by multiple sequence alignment using the Clustal Omega server (Sievers et al., 2011). The conserved regions of RNA polymerase and the alternate sigma factor were identified using the Conserved Domain Database (CDD) at NCBI (Marchler-Bauer et al., 2017). The secondary structure of the proteins was analysed using the Self Optimized Prediction Method with Alignment (SOPMA) tool (Geourjon & Deleage, 1995). The motif predictions for the alternate sigma factor were carried out using the MOTIF Search server.

Model building for RpoC, RpoB, ComX and CIN-DNA sequence
The crystal structures for RpoC, RpoB and ComX were not available and hence three dimensional models were generated by homology modelling using Modeller 9.11 (Webb & Sali, 2017). The templates for RpoC and RpoB were chosen by structure similarity search using the protein BLAST server against the PDB database. The best matches chosen as templates for the β′ subunit were the β′ subunits from Thermus aquaticus (PDB ID: 2GHO_D) and Escherichia coli (PDB ID: 3LU0_D; 55% identity with 100% query coverage and an e-value of 0). Similarly, the RpoB protein was modelled by choosing the β subunit from Mycobacterium smegmatis (PDB ID: 5VI5_C; 60% identity with 100% query coverage and an e-value of 0) as template from the PDB. In the case of ComX, comparative modelling was performed on the @TOME (Automatic Threading Optimization Modelling and Evaluation) v3 platform (Pons & Labesse, 2009), which uses Modeller to build models. The template chosen for model building was SigH from Mycobacterium tuberculosis (PDB ID: 6JCX_F). The best models were chosen with the help of the DOPE score and their stereochemical properties were validated using the Ramachandran plot. The CIN-DNA (TACGAATA) sequence was modelled using the COOT software module (Emsley et al., 2010).

Molecular dynamics simulation of modelled RpoC, RpoB, ComX and CIN-DNA
Molecular dynamics simulation was performed for the modelled structures using GROMACS 2018 software (Hess et al., 2008; Van Der Spoel et al., 2005). Models of RpoC, RpoB, ComX and CIN-DNA were simulated separately, each solvated with the SPC216 water model in a cubic box. A distance of 1.0 nm was kept between each solute and the box edge. The net charges of the protein and DNA molecules in the solvated system were neutralised by adding Cl− and Na+ ions accordingly. Energy minimization was carried out for 50,000 steps for both protein and DNA by the steepest descent algorithm with a tolerance limit of 1000 kJ/mol/nm. Van der Waals and electrostatic interactions were treated with a cut-off of 1 nm, using the fast Particle Mesh Ewald (PME) method (Kawata & Nagashima, 2001) for electrostatics. The LINCS (Hess et al., 1997) and SETTLE (Miyamoto & Kollman, 1992) algorithms were used to constrain bonds and the geometry of water molecules.
Temperature was regulated by the V-rescale thermostat (a modified weak coupling Berendsen thermostat) and pressure by the Parrinello-Rahman method (Martonak et al., 2003). Position-restrained equilibration at constant volume and temperature (NVT) and at constant pressure and temperature (NPT) was run for 100 ps to check that the system had equilibrated. The protein and DNA structures were then subjected to production molecular dynamics simulation on a 50-100 ns time scale.
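The settings named in this section can be collected as GROMACS run-parameter (.mdp) entries. The following is only a minimal sketch: the helper `render_mdp` and the dictionary names are hypothetical, the option keys are standard GROMACS .mdp keys, and any value the text does not state (time step, coupling constants, reference temperature) is deliberately left out rather than guessed.

```python
# Hypothetical helper: gather the simulation settings stated in the text as
# GROMACS .mdp key/value pairs. Values not given in the text are omitted.

MINIMISATION = {
    "integrator": "steep",   # steepest descent minimisation
    "nsteps": 50000,         # 50,000 minimisation steps
    "emtol": 1000.0,         # tolerance in kJ/mol/nm
    "coulombtype": "PME",    # Particle Mesh Ewald electrostatics
    "rcoulomb": 1.0,         # electrostatic cut-off in nm
    "rvdw": 1.0,             # Van der Waals cut-off in nm
}

COUPLING = {
    "tcoupl": "V-rescale",             # thermostat named in the text
    "pcoupl": "Parrinello-Rahman",     # barostat named in the text
    "constraint-algorithm": "lincs",   # LINCS; SETTLE is applied to water by GROMACS
}


def render_mdp(settings):
    """Render a dict of options as aligned 'key = value' lines."""
    width = max(len(key) for key in settings)
    lines = [f"{key.ljust(width)} = {value}" for key, value in settings.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_mdp(MINIMISATION))
    print(render_mdp(COUPLING))
```

The rendered text would still need the remaining mandatory options before it could be passed to `gmx grompp`.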

The PCA and FEL analysis
Principal Component Analysis (PCA) was used to describe the energy profile of each protein along its trajectory, using eigenvectors obtained from the covariance matrix constructed from the atomic fluctuations of the protein molecule (Amadei et al., 1993). The atomic fluctuations of the protein molecules are defined by their dihedral angles (φ, ψ) during simulation, and these were used to define the cosine content (c_i) of principal component (p_i) of the covariance matrix. The Free Energy Landscape (FEL) was derived from the PCA over trajectory intervals with sufficient sampling, chosen using the cosine content value. The cosine content ranges from 0 to 1 for a given simulation time period T.
A cosine content close to 1 represents large-scale motion of the simulated protein, whereas values in the range 0.2-0.5 represent a smooth, single basin (Maisuradze & Leitner, 2007). Thus the projections on the eigenvectors of RpoC, RpoB, ComX and CIN-DNA were generated and their cosine content values were analysed. The PCs having a cosine content value of less than 0.2 were chosen as PC1 and PC2 for obtaining the FEL. The FEL analysis was used to select the models for docking studies.
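The cosine content used for this selection can be illustrated numerically. The sketch below assumes the commonly used definition c_i = (2/T) (∫ cos(iπt/T) p_i(t) dt)² / ∫ p_i(t)² dt, which the text does not spell out; the function name `cosine_content` is hypothetical, and the integrals are approximated by a midpoint sum over equally spaced frames.

```python
import numpy as np


def cosine_content(projection, i=1):
    """Cosine content of a trajectory projected on principal component i.

    `projection` holds p_i(t) at equally spaced frames. The total time T
    cancels out of the ratio, so only the frame count is needed.
    """
    p = np.asarray(projection, dtype=float)
    n = p.size
    t = (np.arange(n) + 0.5) / n            # frame midpoints as fractions of T
    cos_term = np.cos(i * np.pi * t)
    numerator = (np.sum(cos_term * p) / n) ** 2   # (1/T * integral of cos * p)^2
    denominator = np.sum(p ** 2) / n              # 1/T * integral of p^2
    return 2.0 * numerator / denominator


if __name__ == "__main__":
    t = (np.arange(1000) + 0.5) / 1000
    # A projection shaped like half a cosine period gives a value close to 1.
    print(cosine_content(np.cos(np.pi * t)))
    # A rapidly oscillating projection gives a value well below 0.2.
    print(cosine_content(np.sin(20 * np.pi * t)))
```

Under the criterion stated above, principal components whose value falls below 0.2 would be retained for building the FEL.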

Docking studies of RpoB-RpoC-ComX-CIN-DNA
The CIN-box promoter region carries the CIN-DNA sequence to which the sigma factor ComX binds. Hence RpoC, RpoB, ComX and CIN-DNA were docked using the High Ambiguity Driven Bio-molecular Docking (HADDOCK) interface (de Vries et al., 2007; Dominguez et al., 2003). Initially, ComX was docked with the CIN-DNA, and the resulting complex was then docked with the RNA polymerase RpoC and RpoB subunits. The best cluster after docking was selected using an RMSD cut-off and the HADDOCK score. Interactions and interface residues between docked complexes were analysed using the PDBePISA server (Krissinel & Henrick, 2007).

Molecular dynamics simulation of RpoB-RpoC-ComX-CIN-DNA complex
The best docked complex was chosen for molecular dynamics to analyse the stability of the RpoC-RpoB-ComX-CIN DNA interaction. The model was subjected to production MD for 75 ns and FEL analysis was performed to obtain the lowest energy minimum structure. Further structure analysis was performed using GROMACS 2018 software.
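Picking a lowest energy structure from an FEL can be sketched as follows. This is an illustration only: it assumes the usual Boltzmann inversion G = −k_B T ln(P/P_max) over a two-dimensional histogram of PC1 and PC2, a temperature of 300 K (not stated in the text), and hypothetical function names; it is not the GROMACS implementation.

```python
import numpy as np

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)


def free_energy_landscape(pc1, pc2, bins=32, temperature=300.0):
    """Free energy (kJ/mol) on a PC1/PC2 grid; empty bins are +inf."""
    hist, xedges, yedges = np.histogram2d(pc1, pc2, bins=bins)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):      # log(0) for empty bins is expected
        g = -K_B * temperature * np.log(prob / prob.max())
    return g, xedges, yedges


def lowest_energy_frame(pc1, pc2, g, xedges, yedges):
    """Index of the frame closest to the centre of the lowest energy bin."""
    ix, iy = np.unravel_index(np.argmin(g), g.shape)
    cx = 0.5 * (xedges[ix] + xedges[ix + 1])
    cy = 0.5 * (yedges[iy] + yedges[iy + 1])
    dist2 = (np.asarray(pc1) - cx) ** 2 + (np.asarray(pc2) - cy) ** 2
    return int(np.argmin(dist2))


if __name__ == "__main__":
    # Synthetic projections: one dense basin near (1, 1) plus sparse background.
    rng = np.random.default_rng(0)
    pts = np.vstack([rng.normal(1.0, 0.1, size=(900, 2)),
                     rng.uniform(-3, 3, size=(100, 2))])
    g, xe, ye = free_energy_landscape(pts[:, 0], pts[:, 1])
    print(lowest_energy_frame(pts[:, 0], pts[:, 1], g, xe, ye))
```

Multiplying the returned frame index by the trajectory output interval gives the simulation time of the representative structure.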

Results and discussion
Sequence and structure analysis of beta subunit RNA polymerase RpoB
The RpoB model was built by homology modelling using the β subunit from M. smegmatis (PDB ID: 5VI5_C) (Figure 1A) as template, which showed 60% identity, and the best model was chosen using the DOPE score. The conserved domains, namely rpoB (Gly3-Glu1147) and the PRK00566 super family (Ile286-His326) (Figure 2B), were identified using the CDD database, and the conserved regions A-I annotated in the homologous structure were identified in RpoB (Sweetser et al., 1987) (Figure 3). Apart from conserved regions, certain non-conserved lineage-specific inserts such as β-β′ Module 2 (BBM2), known to occur in Mycobacterium, are observed in RpoB (Iyer et al., 2004). The β-β′ subunit interface residues were analysed from the homologous T. aquaticus structure (Zhang et al., 1999), in which β regions H and I interact with β′ regions C, D, G and H, and these binding regions were employed while docking the β and β′ subunits.
The beta subunits of RNA polymerase are attractive targets for various well known inhibitors, such as rifampicin. The rifampicin binding regions are classified into four clusters (Lisitsyn et al., 1984) and mutations in these regions contribute to drug resistance in clinical isolates of M. tuberculosis (Kapur et al., 1994). In RpoB of group A Streptococcus (GAS), rifampicin sensitive clusters [cluster I (Leu471-Leu493), cluster II (Pro524-Leu535)] and single amino acid residues (Val135 and Arg649) were identified and found to be identical to those of Mycobacterium (Figure 4). Other notable regions in the template structure of M. tuberculosis (MTb), which bind to ECF sigma factor region4 during promoter recognition, are the beta flap tip helix (βFTH) and beta C terminal helix (βCTH) regions. A similar beta flap helix domain (Glu847-Glu867) and beta C terminal domain (Asp1154-Asp1166) were identified in RpoB of GAS.

Sequence and structure analysis of beta prime subunit RNA polymerase RpoC
The three dimensional model of the RNA polymerase β′ subunit RpoC was generated by homology modelling using the Modeller 9.11 tool, which yielded twenty models, and the best model was selected based on its DOPE score (Figure 1B). Based on sequence similarity to homologous structures, a major conserved PRK00566 DNA-directed RNA polymerase domain (Asp3-Glu1195) was identified in RpoC. Lineage-specific insert sequences, namely β′ insertion 6 (β′i6) in E. coli and β′ insertion 2 (β′i2) in T. aquaticus, were absent in S. pyogenes (Chlenov et al., 2005). During the transcription process, the active site of the elongation complex undergoes a structural change in which the trigger loop converts into a helix, forming a three-helix bundle with the bridge helix and establishing a stable insertion state for the nucleotides (Vassylyev et al., 2007; Wang et al., 2006). In RpoC, the bridge helix residues (Val774-Gln810) in region F and the trigger loop/helix region (Val920-Gly932 and Gly954-Phe962) were identified, revealing structural similarity to the E. coli template (Tuske et al., 2005).
The process of polymerisation forms a DNA-RNA hybrid upstream of the secondary channel, which is separated from the downstream DNA with the help of the rudder region. Subsequently, RNA cleavage from the DNA-RNA hybrid is facilitated by the lid region. The lid region in RpoC (Val234-Arg261) displays structural homology to that of E. coli (Val244-Val272) (Toulokhonov & Landick, 2006). Similarly, the rudder region (Met288-Gln330) enclosing region C shows a homologous fold in comparison with both T. aquaticus (Met282-Gln325) and E. coli (Met298-Met330). Two key residues in S. pyogenes (Arg312 and Lys315), essential for separation of RNA from the DNA-RNA hybrid, were found to be identical to those in the rudder regions of the templates E. coli (Arg322 and Lys325) and T. aquaticus (Arg307 and Lys309) (Korzheva et al., 2000; Zhang et al., 1999).
The RNA polymerase holoenzyme is formed only when the dissociable sigma factor binds to the core subunit and identifies the promoter region. Previous studies report that the coiled-coil domain (Phe260-Asn309) in the N domain of the β′ subunit of E. coli interacts with the primary sigma factor (Arthur & Burgess, 1998). A similar coiled-coil domain (Arg249-Asp298) was identified in the modelled RpoC and the same binding site residues were employed for docking of the alternate sigma factor ComX.
Apart from these structurally important regions, certain sequence specific domains were analysed in both structures. The Zn binding motif (CX6CX2C) comprises four cysteine residues, of which the fourth is generally observed 82 amino acids away from the motif region and is said to play a major role in β′ folding (Markov et al., 1999). A similar sequence arrangement was observed in RpoC (Cys819, Cys893, Cys900 and Cys903) between regions F and G. The catalytic active centre (NADFDGD), a well conserved region in both prokaryotes and eukaryotes, was identified in RpoC (Asn448-Asp454) and is located in region D. This encompasses the key residues D448, D450 and D452, which play a major role in chelation of the Mg2+ ion (Joyce & Steitz, 1994; Zaychikov et al., 1996).
The formation of an Open Promoter Complex (OPC) during transcription requires an interaction between the sigma factor, the core RNA polymerase and the promoter region (Figure 1D). In SigH, region2 binds to the coiled-coil domain of the β′ subunit, region 4 binds to βFTH, and the −10 promoter region is recognised by the specificity loop. Similar binding regions were identified in ComX and employed for docking. The superposition of the model of ComX with its template SigH is shown in Figure 5.

Structural stability of proteins
The generated models of RpoC, RpoB and ComX were subjected to 50 ns and 100 ns molecular dynamics simulations to assess their structural stability. The Root Mean Square Deviation (RMSD) profile, the Root Mean Square Fluctuation (RMSF), the FEL analysis and the maximum residual displacement were calculated for all three proteins.
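The two stability measures can be written down compactly. The sketch below is a generic NumPy illustration with hypothetical function names, not the GROMACS tools used in this study; it assumes the frames have already been least-squares fitted to the reference and that coordinates are stored as an array of shape (frames, atoms, 3) in nm.

```python
import numpy as np


def rmsd_per_frame(traj, ref):
    """RMSD of every frame from a reference structure (one value per frame)."""
    diff = np.asarray(traj) - np.asarray(ref)
    return np.sqrt((diff ** 2).sum(axis=2).mean(axis=1))


def rmsf_per_atom(traj):
    """RMSF of every atom about its time-averaged position (one value per atom)."""
    traj = np.asarray(traj)
    mean_pos = traj.mean(axis=0)
    return np.sqrt(((traj - mean_pos) ** 2).sum(axis=2).mean(axis=0))


if __name__ == "__main__":
    # Toy two-frame, two-atom trajectory: the second frame is shifted by (3, 4, 0).
    ref = np.zeros((2, 3))
    traj = np.stack([ref, ref + np.array([3.0, 4.0, 0.0])])
    print(rmsd_per_frame(traj, ref))
    print(rmsf_per_atom(traj))
```

RMSD is read along the time axis to judge equilibration, whereas RMSF is read along the residue axis to locate flexible regions such as loops.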

RNA polymerase beta prime subunit RpoC
The RMSD profile with respect to the backbone showed restricted fluctuation, and equilibration was obtained between 12-30 ns and 45-50 ns (Figure 6). The RMSF plot of the residual fluctuations of RpoC showed a deviation of 0.25-1.12 nm (Figure 6E). The β′E region, which comprises a loop region, showed the maximum fluctuation of about 1.12 nm, whereas the β′D region containing the active site motif showed the minimum of 0.25 nm. The regions β′F, β′G, β′H and β′A displayed a variation of 0.5-0.6 nm, while β′C exhibited 0.8 nm. This illustrates the flexible nature of the protein, mainly due to the presence of many loops.
Further structural deviations between the native and simulated structure were investigated to recognise major changes in overall orientation of the protein. The simulated structure of RpoC was chosen at 30 ns time period from FEL plot (Figure 7). Maximum residual displacement was calculated by generating axes for selected residues. In RpoC both N terminal residues (Met1-Gly93) and C terminal residues (Gln1189-Ser1203) showed deviation of 10.3 Å and 10.7 Å respectively. The region between Asp495 to Ala655 also showed a deviation of 5.7 Å, while the binding interface of ComX recorded a deviation of 0.2 Å.

RNA polymerase beta subunit RpoB
The RMSD graph of RpoB showed a steep increase but stabilised between 25 ns and 40 ns. No greater deviations were observed thereafter up to the end of 50 ns, depicting a well equilibrated structure (Figure 6A). The FEL analysis indicated three low energy clusters and the lowest energy structure was chosen at 49 ns (Figure 8). The RMSF fluctuations were analysed region-wise (Figure 6D), where the βC and βG regions showed 1 nm deviation, whereas regions βF and βH exhibited 0.6-.05 nm fluctuation. The remaining residues appeared well stabilised, without much fluctuation during simulation.
The residual displacement measured between the native and simulated structures at the N and C terminal residues was 6.3 Å and 17.6 Å respectively, which is relatively high. The ComX binding region and the β-β′ binding regions H and I showed a displacement of about 2-3 Å.

The alternate sigma factor ComX
The RMSD profile with respect to the backbone clearly showed that ComX attained structural stability between 50-100 ns (Figure 6B). The RMSF plot highlighted highly fluctuating residues with deviations between 0.5-2.0 nm. Residues in region 4 showed deviations of 1-1.5 nm (Figure 6C). The region2 and specificity loop regions exhibited smaller deviations in the range of 0.5-0.6 nm. The structure chosen from the FEL for further analysis was at 61 ns (Figure 9) and its structural changes were evaluated. In the simulated structure, changes in the helical regions were observed; the α3 helix was broken into two separate helices from Leu80-Ser101 and Glu113-Ala120, while the α4 helix completely dissociated into a loop. Thus the simulated structure contained the α1 (Met23-Tyr45), α2 (Leu49-Gln45), α3 (Leu80-Phe88), α4 (Asn94-Ser101), α5 (Glu136-Gln144) and α6 (Gln148-Phe153) helices. The maintenance of the majority of the helical structure of ComX during simulation indicated the reliability of the modelled structure. Maximum residual displacement was observed in the N-terminal (Met1-Ile38) and region4 (Ser147-Leu183) residues. These structural deviations with respect to the native structure highlight the changes that the protein has undergone during dynamics simulation.

Understanding the mechanism of DNA melting by RpoC-RpoB-ComX
The interactions of RpoC-RpoB-ComX with the single and double stranded CIN promoter regions have been studied by docking and molecular dynamics simulation. The best docked complex structure was chosen with the help of the HADDOCK score and other favourable interactions. The docked protein-DNA complex was subjected to 75 ns production molecular dynamics simulation and the stability of interactions between the complexes was analysed. The active site residues were employed for docking; initially ComX (region2 and region4) and the CIN promoter region (−9 to −16) were docked, followed by docking with the complex of RpoC (regions B, C, D, G and H) and RpoB (regions H and I).
The RpoC-RpoB-ComX complex was docked to double stranded CIN promoter region to substantiate the actual binding nature of sigma factor before melting process. Overall, three protein chains and DNA oligonucleotide were complexed to spotlight transcription initiation complex assembly in S. pyogenes.

Closed promoter complex (CPC)
The holoenzyme establishes interaction with double stranded promoter DNA, forming a closed promoter complex (Figure 10A). The holoenzyme-CIN DNA docked trimer complex was analysed for favourable interactions. The ΔG energy of the RpoC-RpoB docked complex was 0.4 kcal/mol, whereas that between double strand (ds) CIN-DNA and ComX was −16.9 kcal/mol. The β-β′ interface formed 30 hydrogen bonds and 9 salt bridges (Table S1, supplementary material), the interaction between β and ComX exhibited 12 hydrogen bonds and 4 salt bridges (Table S2, supplementary material), while β′ and ComX showed 7 hydrogen bonds and 3 salt bridges (Table S3, supplementary material). The interaction of ComX residues Lys42, Arg43, His44, Arg79, Lys85 and Ser89 with ds CIN DNA exhibited 6 hydrogen bonds (Table S4, supplementary material) with the non-template (nt) strand residues −12, −13, −14 and −17. These interactions shed light on the binding pattern of the sigma factor, which is necessary for melting the promoter complex.

Open promoter complex (OPC)
The open promoter complex indicates conversion of double strand CIN-DNA to single strand (Figure 11B) for further initiation of transcription. Inspection of the ΔG value of the complex indicated that the energy between RpoC and RpoB was −0.4 kcal/mol, while that between ComX and CIN-DNA was −11.3 kcal/mol. In the OPC, the β-β′ interface exhibited 36 hydrogen bonds with 12 salt bridges (Table S5, supplementary material), β and ComX showed 4 hydrogen bonds and 1 salt bridge (Table S6, supplementary material), while β′ and ComX interacted through 14 hydrogen bonds and 2 salt bridges (Table S7, supplementary material). The residues of ComX region2, namely Glu26, Lys33, Leu41, Lys42, His44, Tyr46, Trp55 and Tyr81, interact with single stranded DNA, exhibiting 9 hydrogen bonds (Table S8, supplementary material) with the −7, −8, −14 and −15 region of the nt strand (Figure 11). These key residues play a major role in protein-protein and protein-DNA complex formation, and the stability of the complex formed was further ascertained by MD simulation.

Molecular dynamics simulation of protein-promoter complexes

Closed promoter complex (CPC)
The CPC was subjected to 75 ns production MD simulation. The lowest energy structure from the FEL was chosen at 63 ns for further analysis (Figure 12). RpoC and RpoB bound with an energy of −4.2 kcal/mol, while CIN-DNA and ComX showed a lower value of −4.4 kcal/mol. Major structural changes at the protein-DNA binding interfaces were observed between the simulated and native structures. In the β-β′ binding region, a maximum deviation of 6.5 Å was observed within the βH domain. The β′ regions showed minimum deviations ranging from 0.7-2.2 Å. The ComX region2 binding to CIN-DNA alone showed a gross deviation of 19.2 Å, whereas the regions binding to the β domain βFTH and the β′ coiled-coil domain exhibited smaller deviations of 4.8 Å and 0.8 Å respectively. The structure of CIN-DNA showed a 13 Å deviation, illustrating a shift in binding pattern for favourable interactions between the complexes. The CPC showed the following hydrogen bonds and salt bridges: 27 and 8 between β and β′ (Table S9, supplementary material), 1 and 1 between β and ComX (Table S10, supplementary material), 9 and 2 between β′ and ComX (Table S11, supplementary material); while ComX and CIN DNA showed 4 bonds (Table S12, supplementary material). Only a few active site interactions were maintained throughout the simulation. In the RpoC-RpoB complex, six hydrogen-bonded residue pairs, namely Glu763-Glu392, Glu763-Arg389, Asp939-Arg389, Asn1024-Lys435, His1045-Ser311 and Asn1086-Arg30, were found to be intact. Similarly, between ComX and β the pair Glu853-Arg165, and between ComX and CIN DNA the pairs Arg43 with Cyt −14 and Ser89 with Ade −12 of the non-template strand, were maintained throughout the MD simulation. Apart from active site residues, other neighbouring residues of the RpoC-RpoB, ComX and CIN-DNA regions take part in hydrogen bond interactions, rendering a stable initial complex which aids ComX in binding the promoter regions and subsequently initiating the melting process.
The change in orientation of CIN-DNA clearly illustrates the change in interaction pattern which could be essential for further promoter melting.

Open promoter complex (OPC)
The RMSD graph of the OPC displayed the stability of the complex and the single strand CIN-DNA during simulation. The lowest energy structure at 71 ns (Figure 12) was obtained and further analysed for its structural stability and maintenance of favourable interactions. A comparison of the ΔG energy values of the complexes showed that the energy between RpoC and RpoB was −0.3 kcal/mol, whereas the ComX and CIN-DNA energy was −4.1 kcal/mol. Structural drifts in β-β′ were observed; the binding regions of βH and βG displayed 8.1 Å and 5.5 Å deviation respectively, whereas in the β′ binding regions minimum deviations of 0.1-4.1 Å were observed. The ComX region4 exhibited a deviation greater than 15 Å (Figure 13) and the ComX-β binding domain showed a higher displacement of 14 Å. Interestingly, these structural drifts did not affect the orientation of the docked complex, except for a small structural variation of 3.8 Å in the single strand CIN-DNA.
The simulated OPC displayed the following hydrogen bonds and salt bridges: 18 and 8 between β and β′ (Table S13, supplementary material), 5 and 2 between ComX and β (Table S14, supplementary material), 5 and 2 between ComX and β′ (Table S15, supplementary material); and 5 hydrogen bonds between ComX and CIN DNA (Table S16, supplementary material). The key interaction maintained throughout the simulation between ComX and CIN-DNA was Lys33 with Ade −7 of the non-template strand, while those between RpoC and RpoB were Glu853-Arg37, Asp939-Arg389 and Arg1017-Glu392, representing the stable nature of the complex.
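The interface counts reported for the docked and simulated complexes can be compared side by side. The short sketch below only tabulates numbers quoted in the text (Tables S1-S16) and computes the change in hydrogen bonds after MD; the dictionary layout and function name are hypothetical, ASCII labels stand in for β and β′, and the "4 bonds" reported for ComX and CIN DNA in the simulated CPC are assumed here to be hydrogen bonds.

```python
# (hydrogen bonds, salt bridges) per interface as quoted in the text.
# None marks a value the text does not report.
COUNTS = {
    "CPC": {
        "docked": {"beta-beta'": (30, 9), "beta-ComX": (12, 4),
                   "beta'-ComX": (7, 3), "ComX-DNA": (6, None)},
        "simulated": {"beta-beta'": (27, 8), "beta-ComX": (1, 1),
                      "beta'-ComX": (9, 2), "ComX-DNA": (4, None)},
    },
    "OPC": {
        "docked": {"beta-beta'": (36, 12), "beta-ComX": (4, 1),
                   "beta'-ComX": (14, 2), "ComX-DNA": (9, None)},
        "simulated": {"beta-beta'": (18, 8), "beta-ComX": (5, 2),
                      "beta'-ComX": (5, 2), "ComX-DNA": (5, None)},
    },
}


def hbond_change(complex_name):
    """Hydrogen bonds after MD minus hydrogen bonds after docking, per interface."""
    docked = COUNTS[complex_name]["docked"]
    simulated = COUNTS[complex_name]["simulated"]
    return {name: simulated[name][0] - docked[name][0] for name in docked}


if __name__ == "__main__":
    for name in ("CPC", "OPC"):
        print(name, hbond_change(name))
```

Read this way, the largest losses are at the β-ComX interface in the CPC and at the β-β′ interface in the OPC.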

Melting mechanism of ComX
Binding of the primary sigma factor to the promoter region is facilitated by four regions, namely region1, 2, 3 and 4. Alternate sigma factors, although they lack these regions, mimic the function of the former. Promoter melting commences by flipping out a nucleotide and inserting the flipped out base into the nearest protein pocket. The flipped out bases of DNA are stacked mostly against aromatic amino acids and are stabilized by non-covalent interactions. This kink enables strand separation, thus aiding the initiation of transcription (Saecker et al., 2011). However, promoter recognition by primary and alternate sigma factors is sequence specific and very stringent (Campagne et al., 2014). The CIN promoter motif 5′-T(−16) A(−15) C(−14) G(−13) A(−12) A(−11) T(−10) A(−9)-3′ recognised by ComX is likely to undergo a slightly different mechanism for initiation of transcription. To understand promoter recognition in the context of ComX, the protein was docked with double and single strand CIN promoters and the subsequent simulation results were analysed.
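The position numbering of the CIN motif used in this discussion can be made explicit with a few lines of code. This is a minimal sketch with hypothetical helper names: it maps each base of TACGAATA to promoter positions −16 to −9 as given above, and locates exact copies of the motif in a longer sequence (the example sequence is made up for illustration).

```python
CIN_BOX = "TACGAATA"      # CIN box motif from the text
FIRST_POSITION = -16      # promoter position of the first motif base


def cin_positions(motif=CIN_BOX, first=FIRST_POSITION):
    """Map promoter positions (-16 ... -9) to the bases of the motif."""
    return {first + offset: base for offset, base in enumerate(motif)}


def find_cin_boxes(sequence, motif=CIN_BOX):
    """Return 0-based start indices of exact motif matches in a sequence."""
    seq = sequence.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    print(cin_positions())
    # Made-up sequence containing one copy of the motif.
    print(find_cin_boxes("ggTACGAATAcc"))
```

With this numbering, the T(−10) and A(−11) bases discussed for flipping correspond to the seventh and sixth bases of the motif.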
The representative structure of the double strand CIN-DNA complex with ComX clearly shows clamping towards the double strand CIN promoter region. In SigH the template strand residues take part in interactions in subsequent isomerization steps. In our simulation studies, such template strand interactions were not observed in the double strand complex. The CIN-DNA residues G(−8), A(−12), G(−13), C(−14), T(−17) and G(−19) interact with ComX region2 at Lys42, Arg43, His44, Arg79, Lys85, Ser89, Ser2 and Lys93. Apart from these, NH-π interactions were observed between the Tyr91 residue of ComX and T(−10) and A(−11) at a distance of 5.4 Å (Figure 14A). Hence ComX establishes interactions that could weaken the double strand DNA and initiate the promoter melting mechanism.
In the alternate sigma factor SigH from M. tuberculosis, whose structural details are well elucidated, the specificity loop region recognises the −10 promoter element, and the promoter melting residues Tyr90, Tyr94, Asn88 and Ile85 were identified. Similar promoter melting residues were observed in the OPC, where the T(−10) region of single strand CIN-DNA does not form a hydrogen bond with ComX, but notable NH-π interactions with the Tyr81 residue were seen for both T(−10) and A(−11) at distances of 5 Å and 5.3 Å (Figure 14B). The CIN-DNA regions A(−7), G(−8), A(−9), A(−12), C(−14) and A(−15) form hydrogen bonds with the region2 residues of ComX, namely Met1, Lys33, Arg43, Lys85, Glu26, Leu41, His44, Tyr46, Trp55 and Tyr81, of which Lys33-A(−7) was the only interaction maintained throughout the 75 ns production MD simulation. Hence these interactions make this complex very stable and could contribute to the stability of the unwound form of CIN-DNA by base stacking.
Thus, despite the low degree of sequence conservation between ComX and other alternate sigma factors, a similar interaction pattern comprising charged, neutral, hydrophobic and non-polar residues might contribute to the promoter melting mechanism. However, the mode of binding and the structural deviations in the protein-DNA binding interfaces might lead ComX to follow a slightly different mechanism of DNA melting, in agreement with previous reports which state that each alternate sigma factor follows a unique mechanism for promoter melting. Our study shows that ComX, with only two domains, has the capability to initiate the melting of DNA for the subsequent transcription process.

Figure 11. A close-up view of the mode of interaction between ComX (violet) and single strand CIN-DNA (light blue). The key residues at the interface of ComX and DNA are indicated and hydrogen bonds are marked with their distances.

Conclusions
The transcription initiation process employing alternate sigma factors in S. pyogenes has not been elucidated so far. In this study, the RNA polymerase β′ subunit RpoC, the β subunit RpoB, the dissociable alternate sigma factor ComX and the promoter region CIN-DNA were modelled by homology and comparative modelling. The modelled β′ RpoC exhibits important structural regions such as the trigger loop/helix, lid, rudder, zinc binding motif, βFTH, βCTH and catalytic active site regions, similar to the template structures. The alternate sigma factor ComX was modelled by the comparative method and found to have 2 domains. Transcription initiation was studied by docking RpoC-RpoB-ComX with single strand and double strand CIN-DNA, followed by MD simulations. The ComX region2 wraps around the double strand CIN-DNA, illustrating tight anchoring before the melting process. In conclusion, the ComX complex exhibits a slightly different binding mode in which mostly hydrophobic amino acids take part, and might display a distinctive mechanism of promoter melting. The single strand complex obtained did not show any hydrogen bond interaction in the T(−10) and A(−11) regions, pointing to possible flipping out and base stacking by ComX, as observed in the template structure. Thus promoter recognition and its subsequent melting by an alternate sigma factor could be highly selective and specific, based on environmental cues; understanding this mechanism could help in targeting an inhibitory pathway to prevent the pathogenesis of the bacterium.

Figure 14. (A) The interaction between ComX (green) and the DS CIN-DNA (cyan) after MD simulations of the RpoC-SigX-DS CIN DNA complex. The NH-π interaction formed between the Y91 residue and T(−10) (nt) is illustrated. (B) The interaction between ComX (pink) and single strand CIN-DNA (orange) after MD simulations of the RpoC-SigX-SS CIN DNA complex. The NH-π interaction formed between the Y81 residue and T(−10) and A(−11) (nt) is shown.