Structural and functional role of invariant water molecules in matrix metalloproteinases: a data-mining approach

Abstract
Matrix metalloproteinases (MMPs) are a family of zinc-dependent endopeptidases known to degrade the extracellular matrix (ECM). Being involved in many biological and physiological processes of tissue remodeling, MMPs play a crucial role in many pathological conditions such as arthritis, cancer and cardiovascular diseases. Typically, MMPs possess a propeptide, a zinc-containing catalytic domain, a hinge region and a hemopexin domain. Based on their structural domain organization and substrates, MMPs are classified into six different classes, viz. collagenases, stromelysins, gelatinases, matrilysins, membrane-type and other MMPs. As per previous studies, a set of invariant water (IW) molecules of MMP-1 (a collagenase) plays a significant role in stabilizing its catalytic domain. However, a functional role of IW molecules in other classes of MMPs has not been reported yet. Thus, in this study, IW molecules of MMPs from different classes were located and their plausible role(s) were assigned. The results suggest that IW molecules anchor the structurally and functionally essential metal ions present in the vicinity of the active site of MMPs. Further, they (in)directly interlink different structural features and bridge the active site metal ions of MMPs. This study identifies the key IW molecules that are structurally and functionally relevant to MMPs and hence, in turn, might facilitate the development of potent generalized inhibitor(s) against different classes of MMPs.

Communicated by Ramaswamy H. Sarma


Introduction
The degradation of extracellular matrix (ECM) is an essential feature of the development, repair, morphogenesis and remodeling of tissues (Murphy & Nagase, 2008). It is a meticulously regulated process under standard physiological conditions; however, when dysregulated, it leads to various diseases such as Alzheimer's disease, arthritis, asthma, cancer, cardiovascular diseases, chronic ulcers, encephalomyelitis, multiple sclerosis, nephritis and tumor metastasis (Loffek et al., 2011; Hadler-Olsen et al., 2011). Among the major enzymes implicated in ECM degradation are the matrix metalloproteinases (MMPs), zinc-containing enzymes belonging to the large metzincin superfamily (Bode et al., 1993). MMPs have been reported in animals, archaea, bacteria, nematodes, plants and viruses (Laronha & Caldeira, 2020). MMPs are regulated through the activation of their inhibitory precursor zymogens and by tissue inhibitors of metalloproteinases (TIMPs), so a balance between MMPs and TIMPs is necessary for ECM remodeling in tissues (Visse & Nagase, 2003).
Interestingly, MMPs presumably require a water molecule prior to substrate or inhibitor binding. This water molecule may be completely displaced upon inhibitor binding or retained to serve as a bridge in some complexes with hydroxamates and carboxylates (Babine & Bender, 1997; Cheng et al., 1999; Gavuzzo et al., 2000). In MMP-8, three water molecules have been shown to play important role(s) in bridging the active site residues and the inhibitor molecules (Brandstetter et al., 2001; Gavuzzo et al., 2000; Pochetti et al., 2006, 2009). Similarly, in MMP-13, three water molecules are known to form structural networks at the proximal sites directly interacting with the ligands. Accordingly, the displacement of these structural water molecules by a benzyl/methyl group resulted in an improved design of inhibitors (Engel et al., 2005). In MMP-12, the active site contains three water molecules coordinated to the zinc ion, one of which is hydrogen-bonded to Glu219 (a general base), in an almost regular octahedral geometry. Further, the water molecule that is semi-coordinated to zinc acts as a 'spacer' between Glu219 and the Ile-Ala-Gly fragment (Bertini et al., 2006). More recently, the catalytic domain of human MMP-1 was reported to be stabilized by a set of invariant water (IW) molecules (Chakrabarti et al., 2017). In addition to providing structural stability, water molecules play a crucial role in ligand binding and enzyme catalysis (Bairagya & Mukhopadhyay, 2013; Chakrabarti et al., 2017).
Although MMPs are highly similar at the structural level, they possess only a moderate sequence similarity (~50%). Because of their highly flexible active site conformations, designing a specific inhibitor against MMPs is challenging. Interestingly, the majority of the MMP inhibitors (MMPIs) have been designed specifically to chelate the catalytic zinc ion (ᶜZn) bound at the structurally conserved but variable active site (Mohan et al., 2016). Although water molecules are an integral part of a protein and play a key role in active site formation (de Beer et al., 2010; Li & Lazaridis, 2007), their role in designing inhibitors against MMPs has not been well explored. Thus, in this study, IW molecules of MMPs belonging to four different classes, viz. collagenases (MMP-1, -8 and -13), stromelysins (MMP-3), gelatinases (MMP-9) and other MMPs (MMP-12), were identified. Their localization and interaction(s) with the structurally crucial regions, including the metal ions present in MMPs, were analyzed. Finally, a probable structural and functional role of IW molecules in MMPs is suggested.

Data collection
The list of MMPs considered in this study was extracted from the UniProtKB database (https://www.uniprot.org/) and their three-dimensional atomic coordinates were retrieved from the RCSB Protein Data Bank (https://www.rcsb.org/) (Berman et al., 2000; The UniProt Consortium, 2021). A total of 206 tertiary structures comprising 383 protomers (or subunits) were available in the PDB at the time of analysis. Protomers having fewer than 100 water molecules were excluded; thus, out of 383, a total of 258 protomers were considered for further analysis. The resolution of the selected structures ranges from 1.5 to 2.2 Å (Supplementary material Table S1). At this resolution range, the position of the water molecules can be recognized with high precision, particularly those present in the first hydration shell of the protein molecule. The structures having two or more chains were separated into functional protomers for further analysis. The proteins are referred to according to their PDB nomenclature throughout the manuscript.
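The water-count filter described above can be illustrated with a minimal sketch. It assumes standard fixed-column PDB records and counts distinct HOH residues per chain; the function names (`count_waters_per_chain`, `select_protomers`) are hypothetical and are not taken from the authors' scripts.

```python
from collections import defaultdict

MIN_WATERS = 100  # protomers with fewer waters were excluded in the text


def count_waters_per_chain(pdb_text):
    """Count distinct HOH residues per chain ID in PDB-format text."""
    seen = defaultdict(set)
    for line in pdb_text.splitlines():
        # Fixed columns: residue name 18-20, chain ID 22, residue number + iCode 23-27
        if line.startswith(("HETATM", "ATOM")) and line[17:20] == "HOH":
            seen[line[21]].add(line[22:27])
    return {chain: len(residues) for chain, residues in seen.items()}


def select_protomers(pdb_text, min_waters=MIN_WATERS):
    """Return the chain IDs that carry at least `min_waters` water molecules."""
    counts = count_waters_per_chain(pdb_text)
    return sorted(chain for chain, n in counts.items() if n >= min_waters)
```

Counting residues rather than atom records keeps the tally correct for entries that also list water hydrogens.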

Sequence and structure analyses
Pairwise and multiple sequence alignments (MSA) of MMPs were performed using the web servers BLAST and Clustal Omega, respectively (Altschul et al., 1990; Sievers & Higgins, 2014). The MSA was further decorated using the web tool ESPript v.3.0 (Gouet et al., 2003). The interactions between protein atoms and water molecules were identified using the program Coot v.0.8.9.1 with a maximum distance cut-off of 3.6 Å (Emsley et al., 2010). All the structural figures were prepared using the program PyMOL (PyMOL Molecular Graphics System, Schrödinger, LLC).
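The 3.6 Å distance criterion for protein-water interactions can be sketched as follows. This is only an illustration of the cut-off test, not the procedure Coot uses internally; the function name `water_contacts` and the input layout (a list of label and coordinate pairs) are assumptions made for the example.

```python
import math

CUTOFF = 3.6  # Å, maximum interaction distance stated in the text


def water_contacts(water_xyz, protein_atoms, cutoff=CUTOFF):
    """Return labels of protein atoms within `cutoff` Å of a water oxygen.

    `protein_atoms` is an iterable of (label, (x, y, z)) pairs.
    """
    return [
        label
        for label, xyz in protein_atoms
        if math.dist(water_xyz, xyz) <= cutoff
    ]
```

A purely geometric test of this kind does not check donor or acceptor chemistry, so it lists candidate contacts rather than confirmed hydrogen bonds.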

Identification of invariant water molecules
The protein structures with PDB ids 1SU3, 1QIA, 2OY2, 4WZV, 3LJG and 3KEK for MMP-1, -3, -8, -9, -12 and -13, respectively, containing the highest number of water molecules, were chosen as the reference (fixed) structure of their respective MMP class (Supplementary material Table S1). All other structures in their respective MMP class were considered flexible (mobile) structures. In order to identify the IW molecules, the mobile structures of each MMP class were superimposed over the reference structure of that class employing the module align of the program PyMOL with the help of a home-built shell script. A water molecule of the mobile structure residing spatially within a radius of 1.8 Å of a water molecule of the reference structure was considered invariant. A water molecule of the reference structure having such a counterpart in all the mobile structures of that MMP class was treated as completely (100%) invariant. Since several water molecules were not found to be 100% invariant due to the presence of various molecules such as ligands and small molecules from crystallization buffers, the water molecules having a structural frequency greater than or equal to 90% were also included in the analysis. In order to identify the IW molecules among all the classes of MMPs, water molecules having a structural frequency greater than or equal to 80% were also included. The B factor of each IW molecule was extracted from its respective crystal structure. The solvent-accessible surface area (SASA) of each IW molecule was computed using the program NACCESS v.2.1.1 with a probe radius of 1.4 Å (Hubbard & Thornton, 1993). All IW molecules having a SASA less than or equal to 2.5 Å² were considered to be buried.
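The invariance criterion above (a counterpart within 1.8 Å, scored as a percentage of mobile structures) can be sketched in a few lines. The sketch assumes that all mobile structures have already been superimposed onto the reference, which the authors did with PyMOL; the function names and the dictionary layout are hypothetical.

```python
import math

RADIUS = 1.8  # Å, match radius stated in the text


def conservation_frequency(ref_waters, mobile_structures, radius=RADIUS):
    """Percentage of mobile structures with a water near each reference water.

    `ref_waters` maps a water name to its (x, y, z) coordinates;
    `mobile_structures` is a list of lists of water coordinates, one list
    per superimposed mobile structure.
    """
    freqs = {}
    for name, ref_xyz in ref_waters.items():
        hits = sum(
            any(math.dist(ref_xyz, w) <= radius for w in waters)
            for waters in mobile_structures
        )
        freqs[name] = 100.0 * hits / len(mobile_structures)
    return freqs


def invariant_waters(freqs, threshold=90.0):
    """Names of reference waters whose frequency meets the threshold (in %)."""
    return sorted(name for name, f in freqs.items() if f >= threshold)
```

Calling `invariant_waters(freqs, threshold=80.0)` corresponds to the relaxed cut-off the text applies for the comparison across MMP classes.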

Molecular dynamics simulation
To reaffirm the spatial conservation of IW molecules with respect to their proteins, all the reference structures (protomers) were simulated, adopting the molecular dynamics (MD) conformational search method employing the freely available package GROMACS v.5.1.4 (Abraham et al., 2015). Before each MD simulation, the crystallographic water molecules were removed from the MMP protomers. The atomic charges and topological parameters were obtained using the AMBER03 force field for all the MD simulations (Ponder & Case, 2003). For each MD simulation, the protein was centered in a cubic box generated using the module editconf, keeping a distance of at least 1.0 nm between the edge of the box and the protein atoms in each direction. Subsequently, the protein was solvated with the extended simple point charge (SPC/E) water model using the module genbox.
The simulation system was neutralized by adding the required number of sodium or chloride ions. The system was energy-minimized by performing 50000 steps of the steepest descent method, keeping a force cut-off of 1000 kJ mol⁻¹ nm⁻¹. This was followed by a two-phase equilibration protocol. Firstly, the NVT (canonical or isothermal-isochoric) ensemble was employed to stabilize the system at 310 K for 1000 ps. Secondly, equilibration with an isotropic pressure coupling of 100 kPa and a temperature coupling of 310 K was performed in the NPT (isothermal-isobaric) ensemble for 1000 ps. The temperature (0.1 ps) and pressure (2.0 ps) coupling constants were maintained using the V-rescale temperature coupling and Parrinello-Rahman pressure coupling methods, respectively (Parrinello & Rahman, 1981). The long-range electrostatic interactions were computed using the particle mesh Ewald (PME) algorithm, while the Verlet neighbor list calculation (cut-off: 0.8 nm) was employed to calculate the short-range van der Waals (VdW) interactions (Darden et al., 1993; Essmann et al., 1995). The equations of motion were integrated at a time step of 2 fs after constraining all the bond lengths using the P-LINCS algorithm (Hess, 2008). Finally, for all the protomers, an MD simulation of 100 ns was performed. To compute the residence frequency (RF) of each IW molecule, the structures extracted at every 1 ns during the MD simulation were superimposed onto the reference structure. The superimposition of structures was performed using the module align available in the program PyMOL. The RF of an IW molecule was considered 100% if a water molecule was present within a radius of 1.8 Å of the reference water molecule in all the 101 structures generated during the MD simulation. Analysis of MD-generated structures was performed using a home-built shell script (provided as supplementary material) and the tools available in the GROMACS suite.
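The bookkeeping behind the production run and the residence frequency can be made explicit with a short sketch. It only restates the arithmetic implied by the text (100 ns at 2 fs per step, one snapshot per ns including the starting structure) and the 1.8 Å occupancy test; the function names are hypothetical and the snapshots are assumed to be superimposed onto the reference already.

```python
import math

RADIUS = 1.8                           # Å, match radius stated in the text
RUN_NS, STEP_FS, SAVE_NS = 100, 2, 1   # run length, time step, sampling interval


def production_steps(run_ns=RUN_NS, step_fs=STEP_FS):
    """Number of integration steps, using 1 ns = 1,000,000 fs."""
    return run_ns * 1_000_000 // step_fs


def n_snapshots(run_ns=RUN_NS, save_ns=SAVE_NS):
    """Snapshots at t = 0, save_ns, ..., run_ns (both end points included)."""
    return run_ns // save_ns + 1


def residence_frequency(ref_xyz, frames, radius=RADIUS):
    """Percentage of frames with any water oxygen within `radius` of ref_xyz.

    `frames` is a list of lists of water-oxygen coordinates, one per snapshot.
    """
    occupied = sum(
        any(math.dist(ref_xyz, w) <= radius for w in frame) for frame in frames
    )
    return 100.0 * occupied / len(frames)
```

With the defaults, `production_steps()` gives 50,000,000 steps and `n_snapshots()` gives 101, matching the number of structures mentioned in the text.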

Sequence and structural attributes are conserved across MMP classes
To identify the conserved sequence and structural features, a comparative analysis of sequences and structures of all the selected MMPs (MMP-1, -3, -8, -9, -12 and -13) was performed. The result suggests that these MMPs share a sequence identity (query coverage) of 40-60% (70-100%) with each other (Supplementary material Table S2). Most of the MMPs contain an N-terminal signal peptide (~80-90 aa) required for secretion, a propeptide (~80 aa) which keeps the MMP inactive, a catalytic domain (~160 aa) containing two zinc and three calcium ion-binding sites and a linker region (~15 aa) which connects the hemopexin domain (~200-250 aa) (Figure 1A). Notably, gelatinases (MMP-2 and -9) contain an insertion of ~14 aa of fibronectin type-II domain within the catalytic domain (Figure 1A). A few MMPs (MMP-14, -15, -16 and -24) are found to possess an additional transmembrane domain with a small cytoplasmic C-terminal domain (Laronha & Caldeira, 2020). A structure-based MSA of the catalytic domain shows that the zinc-binding motif, the catalytic site and the Met-turn region are well conserved among these MMPs (Figure 1B). The zinc-coordinating residues (Cys92, His168, Asp170, His183, His196, His218, His222 and His228; the residue numbering is as per MMP-1) are conserved (Figure 1B). Similarly, the calcium-coordinating residues (Asp124, Ala157, Asp158, Asp175, Gly176, Gly178, Asn180, Gly190, Gly192, Asp194, Asp198, Glu199 and Glu201) are also almost conserved across all the MMPs. The fibronectin type-II domain-containing loop, known only in gelatinases (Roeb et al., 2002), has a conserved counterpart in the other MMPs as well (hereafter referred to as the conserved loop region (CLR) for MMPs other than gelatinases) (Figure 1B). A typical MMP structure comprises an N-terminal propeptide, a catalytic domain, a variable linker region and a hemopexin (Hpx) domain (Figure 1C). Structural features such as the conserved loop, the metal-binding and the Met-turn regions are spatially conserved (Figure 1B-D).
The active site cleft is bordered by β-strand IV, helix B, and a stretch of random coil adjacent to the carboxy terminus of helix B. The catalytic zinc (ᶜZn) is at the bottom of the cleft (Figure 1D). Further, the structural zinc (ˢZn) interacts with an extended loop (L4) between the β-strands III and IV (Figure 1D). In summary, MMPs belonging to different classes possess significant sequence and structural similarities among themselves.
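The sequence identity and query coverage values quoted in this section can be illustrated with a small sketch. It assumes one common convention, in which identity is the number of identical columns divided by the alignment length and coverage is the number of aligned query residues divided by the query length; the exact values reported by BLAST depend on its own alignment, and the function names here are hypothetical.

```python
def percent_identity(aln_a, aln_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(a == b and a != "-" for a, b in zip(aln_a, aln_b))
    return 100.0 * matches / len(aln_a)


def query_coverage(aln_query, query_length):
    """Aligned (non-gap) query residues as a percentage of the full query."""
    aligned = sum(c != "-" for c in aln_query)
    return 100.0 * aligned / query_length
```

For example, the toy alignment `ACDE-FGH` against `ACDEKFAH` has six identical columns out of eight, i.e. 75% identity under this convention.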

Invariant water molecules contribute to the structural folding in collagenases
To understand the role of water molecules in collagenases, invariant water (IW) molecules present in MMP-1, -8 and -13 (hereafter referred to as M1IWn, M8IWn and M13IWn, respectively, where n represents the nth IW molecule) were identified; owing to the structure selection criteria, MMP-18 was excluded from the analysis. In MMP-1, -8 and -13, 7, 23 and 20 IW molecules were identified, respectively (Supplementary material Tables S3-S5).
In MMP-1, although all the seven IW molecules are located in the vicinity of the catalytic domain, none of them directly interacts with the active site residues or metal ions. Instead, the IW molecules are situated on the backside of the active site, stabilizing the loops. These seven IW molecules (M1IW1, M1IW2, M1IW3, M1IW4, M1IW5, M1IW6 and M1IW7) interact with the residues Arg202, Thr204, Phe207, Arg208, Glu209, Tyr210, Leu212, His213, Arg214, Ile232, Ala234, Tyr240, Thr241, Val246, Leu248, Asp251 and Asp252 of the conserved loop and Met-turn regions (Figure 2A and Supplementary material Table S6). Two of these residues, Phe207 and Tyr240, hold the two loops L7 and L8 through the IW molecule M1IW1 (Figure 2A). The residues Tyr240 and Thr241 are known to stabilize the substrates in the active site of the enzyme. Interestingly, the residues Phe207 and Arg208 correspond to the position where an insertion is found in MMP-9 (Figures 1A and 2A). The IW molecule M1IW6 anchors the residue Arg214 (Figure 2A), which is known to define the depth of the substrate-binding pocket and to determine the substrate specificity, and which has previously been reported to be stabilized by a water molecule (Brandstetter et al., 2001). Notably, all these IW molecules are almost buried with low B factors (Supplementary material Table S3). However, in the MD simulation, these IW molecules have a low residence frequency (Supplementary material Table S3).
In MMP-8, out of 23, five IW molecules (M8IW1, M8IW2, M8IW3, M8IW4 and M8IW8) are located in the vicinity of the active site, stabilizing the loops. Interestingly, IW molecules M8IW1, M8IW2 and M8IW4 directly anchor the ˢZn and calcium (¹Ca and ²Ca) ions (Figure 2B). The IW molecules M8IW4 and M8IW18 anchor the residues Asp173 and Asp137, respectively, which in turn coordinate the calcium ion (¹Ca) (Supplementary material Table S6). The IW molecule M8IW1 stabilizes two residues, Arg145 and Glu179, which form a salt bridge. The IW molecules M8IW2 and M8IW4 stabilize a segment (147-151) of the loop L4 between the βIII and βIV strands and the residue His175, which coordinate the ˢZn (Figure 2B). The IW molecules M8IW1 and M8IW3 anchor the residues Gly155, Asp177, Ala178 and Glu180, which in turn coordinate the calcium ion (²Ca) (Figure 2B). Another six IW molecules (M8IW5, M8IW6, M8IW7, M8IW9, M8IW10 and M8IW11) are situated at the backside of the active site, stabilizing the secondary structural elements of the protein (Figure 2C). One of these IW molecules, M8IW9, interacts with two leucine residues (119 and 229), which form the hydrophobic pocket at the base of the substrate-binding site; the latter residue shows a large conformational change in the inhibitor-bound protein (Pochetti et al., 2009). Two residues, Glu110 and Arg130, form a salt bridge and are stabilized by two IW molecules, M8IW5 and M8IW11 (Figure 2C and Supplementary material Table S6). The loop L6 is stabilized by the IW molecules M8IW3, M8IW7, M8IW12, M8IW17, M8IW19 and M8IW22. Two residues, Leu191 and Phe192, stabilize the residue Tyr227, which acts as a selective gatekeeper for the substrate/inhibitor binding. The IW molecules M8IW6 and M8IW16 are involved in stabilizing the residues Ala206, His207 and two consecutive β-turns (210-216), which form the scaffold for the ligand binding.
The segment (219-229), which shows a large conformational change upon the ligand binding (Pochetti et al., 2009), is stabilized by the IW molecules M8IW9 and M8IW17 (Supplementary material Table S6). Out of the remaining twelve IW molecules, five (M8IW14, M8IW17, M8IW18, M8IW19 and M8IW22) are involved in anchoring the conserved loop and Met-turn regions (Supplementary material Table S6). Most of these 23 IW molecules are buried and demonstrate low residence frequency during the MD simulation (Supplementary material Table S4).
In MMP-13, out of 20, four IW molecules (M13IW3, M13IW4, M13IW5 and M13IW6) are placed in the vicinity of the active site, anchoring the calcium (¹Ca, ²Ca) and ˢZn ions through the residues Glu171, His172, Asp174 and His200 (Figure 2D). Another four IW molecules (M13IW1, M13IW2, M13IW7 and M13IW8) stabilize the secondary structural elements forming the scaffold for the ligand binding (Figure 2D). Interestingly, all these eight IW molecules are buried and show low residence frequency during the MD simulation (Supplementary material Table S5). Out of the remaining twelve IW molecules, two (M13IW9 and M13IW15) interact with the residues Gly196, Asp202 and Glu205, which anchor the calcium ions (Supplementary material Table S6). Similarly, another three IW molecules (M13IW13, M13IW15 and M13IW19) form hydrogen bonds with the residues Asp204, Glu205, Thr206, Thr208, Ser209 and Asn215 from the conserved loop and Met-turn regions (Supplementary material Table S6).
Notably, some of these IW molecules are common among the collagenases. An IW molecule, M1IW1 (M8IW17), interlinks the residues Phe207 (Ser186 in MMP-8) and Glu209 (Asn188 in MMP-8) from the conserved loop region with Tyr240 (Tyr219 in MMP-8) of the Met-turn region (Figure 2E). Surprisingly, in MMP-1, a water molecule M1HOH979 (M8IW4 and M13IW4) helps in bridging the ¹Ca (²Ca in MMP-8) and ˢZn ions by making interactions with the residues Asp194 (Asp173 and Asp198 in MMP-8 and -13, respectively) and His196 (His175 and His200 in MMP-8 and -13, respectively) (Figure 2F). In contrast to MMP-1 and -8, in MMP-13, the residue Tyr244 of the Met-turn region does not interact with any IW molecule but instead interacts directly with the residue Asn215 of the CLR for its stability (Figure 2E).

Invariant water molecules stabilize the critical structural features, including the catalytic domain in stromelysins
Stromelysins are involved in the catabolic process of cartilage proteoglycan aggregate structures and regulate other MMPs by activating them (Ra & Parks, 2007). Unlike MMP-1, the role of IW molecules as an integral part of MMPs has not yet been explored for stromelysins (MMP-3, -10 and -11). In this study, MMP-10 and -11 could not be included due to the low number of water molecules in their structures. For MMP-3, a total of 16 protomers were analyzed and a total of nine water molecules were identified to be invariant (hereafter referred to as M3IWn, where n represents the nth IW molecule). Out of nine, five IW molecules (M3IW1, M3IW4, M3IW5, M3IW6 and M3IW7) are in the vicinity of the calcium (¹Ca, ²Ca and ³Ca) and the ˢZn-binding sites (Figure 3A and B). These IW molecules stabilize the metal ion-coordinating residues (Asp141, Gly159, Gly173, Asp177, Asp181 and Glu184) (Figure 3A and B). Moreover, the IW molecule M3IW6 interlinks the metal ions ²Ca and ˢZn, forming a bridge between them (Figure 3A). Further, IW molecules M3IW1 and M3IW7 form hydrogen bonds with the residues Asp181, Asp183, Gln185 and Thr193 from the CLR, bringing them into close proximity, which results in stabilizing the loop region (Figure 3A and B). Another two IW molecules (M3IW8 and M3IW9) stabilize the catalytic domain of the protein MMP-3 by anchoring the residue Phe210 (Figure 3B). Similar to MMP-13, in MMP-3 also, the conserved loop and Met-turn regions are interlinked directly (Figure 3B). Five of these nine IW molecules are buried and show more than 60% residence frequency during the MD simulation (Supplementary material Table S7). In addition, these IW molecules are also thermally very stable, with an average B factor of 18 Å² (Supplementary material Table S7).

Figure caption (fragment): The interacting amino acid residues are shown as blue lines, while the calcium and zinc ions are shown as magenta and green spheres. The residues Asp177 and His179 interacting with the IW molecule (cyan sphere) along with ²Ca (magenta) and ˢZn (green) ions are shown as a blue stick model.

Invariant water molecules confer stability to gelatinases
Gelatinases (MMP-2 and -9) have been reported to play a central role in tumor angiogenesis and the formation of metastasis (Malla et al., 2008). However, the role of the structural water molecule(s) as an essential part of MMPs has not yet been explored for gelatinases. The protein MMP-2 was excluded from the analysis due to the lack of a sufficient number of water molecules in its structure. In MMP-9, using 33 protomers, a total of 28 water molecules (hereafter referred to as M9IWn, where n represents the nth IW molecule) were identified to be invariant (Supplementary material Tables S1 and S8). Out of 28, eight IW molecules (M9IW1, M9IW5, M9IW7, M9IW10, M9IW11, M9IW16, M9IW17 and M9IW25) form a water network-like structure to hold the calcium (¹Ca-³Ca) and the ˢZn ions (Figure 4A). Interestingly, the calcium (¹Ca) and the ˢZn ions are held together through the IW molecule M9IW7 anchored by the residues Asp201 and His203 (Figure 4A). Another IW molecule, M9IW28, holds the two loop regions, viz. the N-terminal loop and L5 (Figures 1D and 4A and Supplementary material Table S6). Another set of 10 IW molecules (M9IW2, M9IW3, M9IW4, M9IW8, M9IW13, M9IW14, M9IW18, M9IW20, M9IW23 and M9IW26) are located at the backside of the active site in a cluster stabilizing the fibronectin type-II domain-containing loop (Figure 4B and Supplementary material Table S6). Furthermore, the conserved loop and the Met-turn regions are held together through the residues Ser219 and Tyr248, respectively, stabilized by two IW molecules, M9IW15 and M9IW24 (Figure 4B and Supplementary material Table S6). Interestingly, more than 70% of these water molecules are buried and stable, with an average B factor of 19 Å² (Supplementary material Table S8). Due to their buried location, most of these water molecules show a low residence frequency during the MD simulation (Supplementary material Table S8).

Invariant water molecules stabilize metal ions and conserved loop region in MMP-12
MMPs not belonging to collagenases, gelatinases, stromelysins, matrilysins or MT-MMPs are categorized as other MMPs (Cui et al., 2017). This class includes MMP-12, -19, -20, -21, -23A/B, -27 and -28. Owing to the unavailability of structures or of a sufficient number of water molecules (>100), MMP-19, -20, -21, -23A/B, -27 and -28 were excluded from the analysis. In MMP-12, using 99 protomers, a total of 23 water molecules (hereafter referred to as M12IWn, where n represents the nth IW molecule) were identified as invariant (Supplementary material Tables S1 and S9). Out of these 23 water molecules, 10 IW molecules (M12IW2, M12IW4, M12IW6, M12IW7, M12IW9, M12IW10, M12IW13, M12IW15, M12IW19 and M12IW20) are located in the vicinity of the active and metal-binding sites (Figure 5A). Three IW molecules (M12IW6, M12IW7 and M12IW9) interconnect the calcium (¹Ca) and the ˢZn ions through the residues Asp194 and His196 (Figure 5A and Supplementary material Table S6). Another six IW molecules (M12IW2, M12IW4, M12IW10, M12IW13, M12IW19 and M12IW20) form a water-mediated hydrogen-bond network around the other two calcium (²Ca and ³Ca) ions (Figure 5A). The remaining 13 IW molecules are located at the backside of the active site (Figure 5B). Among these, five IW molecules (M12IW3, M12IW11, M12IW14, M12IW17 and M12IW18) are located in the proximity of the calcium (²Ca and ³Ca)-binding sites and stabilize the conserved loop region (Figure 5B and Supplementary material Table S6). The IW molecule M12IW17 is involved in holding the Met-turn region through the residues Ser207 and Tyr240 (Figure 5B and Supplementary material Table S6). The residue Tyr240 is known to stabilize the substrates in the active site of the enzyme. This interaction not only provides stability to the Met-turn region from its C-terminal side but also holds the conserved loop and Met-turn regions in close proximity (Figure 5B).
Interestingly, 80% of these IW molecules are buried and stable, with an average B factor of 13 Å² (Supplementary material Table S9). Also, presumably due to their buried location, the corresponding sites are observed to be less occupied by a water molecule (average RF = 30%) during the MD simulation (Supplementary material Table S9).
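Per-class summaries of the kind quoted above (percentage buried, average B factor, average RF) follow directly from a per-water table. The sketch below assumes each IW molecule is described by a dictionary with the hypothetical keys `sasa` (Å²), `b` (Å²) and `rf` (%), and applies the 2.5 Å² burial threshold from the methods; it is not derived from the supplementary tables.

```python
from statistics import mean

BURIED_SASA = 2.5  # Å², burial threshold stated in the methods


def summarize(waters, buried_sasa=BURIED_SASA):
    """Summarize burial, B factor and residence frequency for a set of waters."""
    buried = [w for w in waters if w["sasa"] <= buried_sasa]
    return {
        "percent_buried": 100.0 * len(buried) / len(waters),
        "mean_b": mean(w["b"] for w in waters),
        "mean_rf": mean(w["rf"] for w in waters),
    }
```

The same function can be applied to the IW set of each MMP class in turn, which makes the burial and stability figures directly comparable between classes.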

Invariant water molecules present at the surface of MMPs stabilize the overall structure
In addition to the individual MMP classes, the IW molecules common among all the MMP classes (MMP-1, -3, -8, -9, -12 and -13) were also identified, considering the reference structure of each MMP class as its representative. Since the structure of MMP-9 contains the highest number of water molecules (328) among these structures, it was considered as the reference structure. An IW molecule of MMP-9 was considered to be conserved among all the other MMPs if there exists a water molecule in the mobile structure within its 1.8 Å radius. In this way, a total of seven water molecules (hereafter referred to as ACWn, where A represents all, CW represents common water and n represents the nth common water molecule) were found to be conserved among all the MMP classes (Supplementary material Table S10). Apart from the presence of IW molecules in the vicinity of the important structural features and metal ions, their presence was also observed at the surface of all MMPs considered for this study. The details of the IW molecules conserved across the MMPs are provided in Table S10 (Supplementary material). These IW molecules interact with the amino acid residues present at the structural surfaces of MMPs, forming a cage (Figure 6A and B and Supplementary material Table S10). As expected, these IW molecules show high SASA values and thus are only moderately conserved during the MD simulation due to their high flexibility (Supplementary material Table S10). All the IW molecules interacting with the structural elements in MMP-1, -3, -8, -9, -12 and -13 are completely buried except ACW5 (Supplementary material Tables S3-S5 and S7-S9). Water molecules located at the protein surface are reported to help maintain protein stability (Levy & Onuchic, 2004).

Figure caption (fragment): The location of seven conserved water molecules present in all the MMPs (MMP-9, red; -1, limegreen; -3, skyblue; -8, limon and -13, orange). The conserved water molecules are shown as a cyan sphere, while those interacting with them are as a red sphere. The calcium and zinc ions are shown as magenta and green spheres, respectively. For the clarity of the figure, only a few of the residues are labeled.

Discussion
Water molecules are considered to be an integral part of a protein. At the tertiary structure level of proteins, they provide stability by forming hydrogen-bond networks. In many protein families, water molecules are observed to be structurally conserved, providing structural and functional importance. For instance, the structural water conservation has been reported for cytochrome P450, phospholipase A 2 , b-glycosidase, D-xylose isomerase, bromodomains, ribose-1,5bisphosphate isomerase and caspase (Aldeghi et al., 2018;Dhanasekaran et al., 2013;Gogoi & Kanaujia, 2019;Kanaujia & Sekar, 2008;Sharma et al., 2021;Teze et al., 2013;Zhao et al., 2005). Apart from their role in providing structural stability to the proteins, water molecules have also been shown to be important for drug design (de Beer et al., 2010). Recently, the role of invariant water molecules, their dynamic behavior and the flexibility of the residues around the zinc ions ( s Zn and c Zn) have been shown in MMP-1 (Chakrabarti et al., 2017). In this study, the role of IW molecules is investigated around the metal ions (Ca 2þ and Zn 2þ ), CLR (or fibronectin type-II domain-containing loop), the catalytic domain and the Met-turn region in four classes of MMPs. In MMPs, the c Zn is strongly held by three histidines and a cysteine residue from the propeptide region of inactive MMPs. In the active MMPs, a water molecule replaces the cysteine residue (Jozic et al., 2005;Loffek et al., 2011). Intriguingly, the position of this water molecule is not conserved among MMPs. This suggests that although MMPs necessitate a water molecule for their activity, the spatial position of the water molecule is not conserved.
In collagenases, none of the IW molecules directly interact with the metal ions ( c Zn, s Zn and 1-3 Ca). However, the IW molecules provide stability to these metal ions (except c Zn) by interacting with the metal ion coordinating residues. The bridging of the s Zn and a calcium ion ( 1 Ca/ 2 Ca) in collagenases through an IW molecule (M1HOH979, M8IW4 and M13IW4 in MMP-1, À8 and À13, respectively) is notable. The water bridge formation between two metal ions might provide structural stability of MMPs, if not solely then partly by providing sufficient flexibility to them. Interestingly, the conformation of the conserved loop region in collagenases is anchored by an IW molecule (M1IW4 connecting the residues Thr204 and Asn211 in MMP-1). This suggests that the IW molecules M1IW4, M8IW7 and M13IW8 might play a similar function in different classes of collagenases. Apart from the CLR, the Met-turn region is also stabilized in the vicinity of the catalytic domain by an IW molecule. Additionally, the Met-turn and the CLR are interlinked either directly (in MMP-13) or indirectly via an IW molecule (M1IW1 and M8IW17 in MMP-1 and À8, respectively), which might provide flaccidity to the two imperative loop regions.
In MMP-1, IW molecules stabilize the loops L7 and the C-terminal loop through the residues Phe207 and Tyr240 ( Figure 1D and Supplementary material Table S6). The residue Tyr240 is known to stabilize the substrates in the active site of the enzyme. Interestingly, another residue, Arg214, which determines the substrate selectivity and the depth of the substrate-binding site, is anchored by the IW molecules. In MMP-8, although the IW molecules do not directly bind to the c Zn ion, they anchor s Zn and the calcium ions. Interestingly, IW molecules coordinate the residues involved in the formation of the hydrophobic pocket at the base of the substrate binding; they show large conformational changes in the inhibitor-bound protein (Pochetti et al., 2009). Furthermore, the residue Tyr227, which acts as a selective gatekeeper for the substrate/inhibitor binding, is anchored by the IW molecules. Similarly, in MMP-13, IW molecules stabilize the s Zn and calcium ions. In summary, in collagenases, the IW molecules hold and stabilize the conserved loop and Met-turn regions.
In stromelysins, unlike in collagenases, the IW molecules coordinate the cZn ion in addition to the sZn and the three calcium ions (1Ca-3Ca). However, as in collagenases, the IW molecules mediate the bridging between sZn and 2Ca. Further, the IW molecules dictate the folding of the conserved loop region. In stromelysins, an IW molecule (M3IW8) is conserved in the vicinity of the catalytic domain and maintains its rigidity. Therefore, the IW molecules are important in maintaining the rigidity and stability of key structural features of MMPs, such as the CLR (or the fibronectin type-II domain-containing loop), the catalytic domain and the Met-turn region. The role of IW molecules in gelatinases is similar to that in collagenases. Although no IW molecule is observed to stabilize the cZn ion, the IW molecules coordinate the sZn ion and the three calcium ions (1Ca-3Ca). Also, although the IW molecules are not found to interlink the fibronectin type-II domain-containing loop with the Met-turn region, they stabilize its N-terminal segment.
In other classes of MMPs (e.g. MMP-12), the IW molecules are likewise found to stabilize the metal ions. Moreover, the conserved loop region (CLR) is stabilized by a series of IW molecules. Loop stabilization through a network of water molecules has been reported earlier. For instance, the Ω-loop in class A β-lactamases is stabilized by hydrogen bonds formed through six conserved water molecules located in a narrow, tunnel-shaped cavity (Bös & Pleiss, 2008). Furthermore, in MMP-12 and MMP-13, the IW molecules replace amino acid residues, underscoring their importance. Such replacements have also been reported previously and were found to favor thermodynamic stability (de Beer et al., 2010; Li & Lazaridis, 2007). Altogether, in all the MMPs, the sZn and the calcium ions are bridged through a conserved IW molecule. Also, the conserved loop and Met-turn regions are (in)directly connected by the IW molecules.
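The notion of a water molecule bridging two metal sites through their coordinating residues can be illustrated with a simple distance criterion. The following is a minimal sketch and not the procedure used in this study: the function name, the 3.5 Å cutoff and the toy coordinates are all assumptions introduced purely for illustration.

```python
from math import dist

# Assumed hydrogen-bond distance cutoff in angstroms (illustrative, not taken from this study).
HBOND_CUTOFF = 3.5


def bridging_waters(waters, site_a, site_b, cutoff=HBOND_CUTOFF):
    """Return IDs of waters lying within `cutoff` of at least one atom in each site.

    waters: dict mapping a water ID to the (x, y, z) of its oxygen atom.
    site_a, site_b: lists of (x, y, z) for polar atoms of the residues
    coordinating each metal ion (hypothetical inputs).
    """
    hits = []
    for water_id, xyz in waters.items():
        near_a = any(dist(xyz, atom) <= cutoff for atom in site_a)
        near_b = any(dist(xyz, atom) <= cutoff for atom in site_b)
        if near_a and near_b:
            hits.append(water_id)
    return hits


# Toy coordinates, invented for illustration only.
waters = {"W1": (0.0, 0.0, 0.0), "W2": (10.0, 0.0, 0.0)}
szn_ligand_atoms = [(2.8, 0.0, 0.0)]
ca_ligand_atoms = [(-2.9, 0.0, 0.0), (0.0, 8.0, 0.0)]
print(bridging_waters(waters, szn_ligand_atoms, ca_ligand_atoms))
```

In this toy case only W1 lies within the cutoff of atoms from both sites. A realistic analysis would read the coordinates from the crystal structures and would also consider hydrogen-bond geometry and water occupancy.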
Interestingly, the majority of the IW molecules found interacting with different structural elements in MMPs are completely buried. This indicates that these water molecules might have interacted with their corresponding secondary structural elements prior to the folding of the MMPs. Buried water molecules generally form hydrogen bonds with their neighboring residues, which in turn leads to tightly packed groups in the protein interior owing to favorable van der Waals interactions (Hubbard et al., 1994; Takano et al., 2003). The deeply buried water molecules present in the protein core stabilize the folded structure. Moreover, all the MMPs possess a considerable number of IW molecules at their surface, caging the overall folded structure. These water molecules can be weakly or strongly bound at the surface and possess dynamic properties that differ from those of the water molecules present in the protein core (Levy & Onuchic, 2004; Vukovic et al., 2016).
The results of this study suggest that water molecules may mediate protein-ligand interactions by acting as connecting bridges between the protein and the ligand. The IW molecules present in the vicinity of the active site of MMPs can therefore be considered in structure-based drug development. It can be concluded that the availability and localization of the IW molecules in MMPs not only stabilize the overall structure of the protein but also dictate its folding, thereby providing the requisite flexibility to the MMPs. In summary, the IW molecules identified in this study might be utilized for designing potential drug-like molecules targeting all types of MMPs.

Author contributions
SPK conceived the project and guided the research. HK and SPK performed the molecular dynamics simulations. HK, SKM and SPK analyzed and validated the data. SKM, PG and SPK wrote the manuscript.