Gene expression profile analysis unravelled the systems level association of renal cell carcinoma with diabetic nephropathy and Matrix-metalloproteinase-9 as a potential therapeutic target

Abstract Type 2 diabetes (T2D) and cancer share many common risk factors. However, the potential biological link that connects the two at the molecular level is still unclear. Experimental evidence suggests that several genes and their pathways may be involved in developing cancerous conditions associated with diabetes. In this study, we identified the protein-protein interaction (PPI) networks and the hub protein(s) that interlink T2D and cancer using genome-scale differential gene expression profiles. Further, the PPI network of AMP-activated protein kinase (AMPK) in cancer was analyzed to explore novel insights into the molecular association between the two conditions. The densely connected regions were analyzed by constructing the backbone and subnetworks with key nodes and shortest pathways, respectively. The PPI network studies identified Matrix-metalloproteinase-9 (MMP-9) as a hub protein playing a vital role in glomerulonephritis, tubular diseases, and some genetic kidney diseases. MMP-9 was also associated with different growth factors, like tumor necrosis factor (TNF-α) and transforming growth factor-β1 (TGF-β1), and with pathways like chemokine signaling and NOD-like receptor signaling. Further, the molecular docking and molecular dynamics simulation studies supported the druggability of MMP-9, suggesting it as a potential therapeutic target in treating renal cell carcinoma linked with diabetic kidney disease. Communicated by Ramaswamy H. Sarma


Introduction
Type-2 diabetes (T2D) is a metabolic disorder characterized by prolonged hyperglycemia, after which the skeletal muscle's sensitivity to insulin declines, resulting in insulin resistance. It is marked by altered lipid and glucose metabolism leading to relative defects in insulin secretion by β-cells (Saltiel & Kahn, 2001). This results in an imbalance between energy intake and expenditure. Diabetes can affect different parts of the body and, over time, can produce long-term complications such as poor blood flow resulting in cardiovascular disease, stroke, and peripheral vascular disease, nervous system damage (neuropathy), renal system damage (nephropathy), and eye damage (retinopathy), ultimately leading to life-threatening situations (American Diabetes Association, 2006; Deshpande et al., 2008). Patients with either type 1 or type 2 diabetes have up to a 40% chance of developing diabetic nephropathy (DN) within 20-25 years of onset (Fukami et al., 2007; Remuzzi et al., 2002). As a result, DN is receiving much attention, as it is linked to a higher risk of morbidity and death, especially in Western countries (Maisonneuve et al., 2000). T2D and its progression to comorbidities can, however, be avoided or treated through lifestyle modifications or therapeutic drugs (Ríos et al., 2015).
The therapeutic agents generally target the biological pathways involved in maintaining energy homeostasis to overcome insulin resistance (IR) and metabolic dysfunctions (Coughlan et al., 2014; Ruderman & Prentki, 2004). One such pathway is the AMP-activated protein kinase (AMPK) signaling pathway, which coordinates cell growth, autophagy, and metabolism (Mihaylova & Shaw, 2011). AMPK is a key enzyme that regulates cellular metabolism to maintain energy homeostasis and is hence known as the master regulator of metabolism (Ruderman & Prentki, 2004). It regulates diverse metabolic and physiological processes, the dysregulation of which results in chronic diseases like obesity, inflammation, and diabetes, leading to an increased risk of cancer (O'Neill, 2013; Ruderman et al., 2013; Saha et al., 2011). The most thoroughly understood mechanism by which AMPK regulates cellular activities is suppression of the mammalian target of rapamycin complex 1 (mTORC1) pathway (Mihaylova & Shaw, 2011). However, little is known about the mechanism linking diabetes and cancer. High glucose levels may be a prevailing factor contributing to this link, but the molecular mechanism behind the association between diabetes and cancer is still far from fully understood (Wu et al., 2018).
Patients with T2D have high sodium-glucose co-transporter 2 (SGLT2) expression and an increased incidence of renal cell carcinoma (RCC) (Kuang et al., 2017). The therapeutic exploration of diabetes associated with kidney disease suggested the renin-angiotensin system (RAS) as an important target for metabolic and hemodynamic pathways in diabetic nephropathy (Fukami et al., 2007). Angiotensin-converting enzyme inhibitors (ACE-I) and angiotensin II type I receptor (AT1R) antagonists are the most widely used therapeutic agents for the inhibition of RAS (Fukami et al., 2007). Apart from these, factors like connective tissue growth factor (CTGF), vascular endothelial growth factor (VEGF), platelet-derived growth factor (PDGF), and mitogen-activated protein kinase (MAPK) have also been linked with the development and progression of diabetic nephropathy (Flyvbjerg et al., 2004; Fukami et al., 2007; Kelly et al., 2003).
Recent advancements in biological networks and high-throughput gene expression studies have revealed molecular mechanisms of complex diseases (Goh et al., 2007; Harbison et al., 2004; Miryala & Ramaiah, 2022; Stelzl et al., 2005). Biological network studies are altering the view of cellular and molecular biology by opening up many possibilities to comprehend cellular organization (Chen et al., 2016; Miryala et al., 2021a; Rakshit et al., 2014). Unlike the traditional approaches that focus on individual proteins or genes, network-based protein-protein interaction (PPI) studies provide essential insights into unravelling complex molecular mechanisms (Sun & Zhao, 2010). Over a short period, more effective measures have emerged that combine gene expression data with groups of genes expressed in common pathways. This involves identifying disease-specific markers, scoring them against known pathways, and evaluating the coherency of changes in gene expression (Subramanian et al., 2005). However, pathway-based analysis has its limitations, as many human genes are not assigned to definitive pathways. Network-based approaches offer an effective means in this regard by providing and connecting potential disease-specific markers.
The network-based computational biology approaches have become powerful and informative tools for studying disease mechanisms at the molecular level (Bradley et al., 2008). Several studies have suggested the detection of disease-related networks by employing the co-expression network (Gerits et al., 2008), protein-protein interaction (PPI) network (Pham et al., 2008), protein phosphorylation networks (Wang et al., 2011), and the DNA methylation network (Krysan et al., 2005). Studying these networks, particularly the PPI network, provides valuable information on biological systems. In this study, using the genome-scale differential gene expression profiles and an integrated PPI network of AMPK in cancer, we identified and compared the protein network and the active subnetworks that interlink T2D with cancer. Further, the backbone network and its subnetworks were constructed using key nodes and their shortest paths, and the densely connected regions were investigated. In conclusion, our study provides novel insight into understanding the molecular mechanism interlinking T2D and cancer.

Microarray data collection
Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/geo/) (Edgar et al., 2002) is a public functional genomics data repository with high-throughput gene expression data. In the present study, a gene expression dataset [GSE85990] was downloaded from the GEO database to deduce the role of differentially expressed genes (DEGs) connecting diabetes with cancer (Wu et al., 2018). The probes were converted into the appropriate gene symbols using the platform's annotation information. The GSE85990 data set contained 12 samples, including A2058-TET2WT, A2058-TET2M, and Mock cells treated under high-glucose and normal-glucose conditions (Wu et al., 2018).

Identifying differentially expressed genes (DEGs)
The DEGs were identified using GEO2R (http://www.ncbi.nlm.nih.gov/geo/geo2r/), an R-based web application available in the GEO database (Barrett et al., 2013). The DEGs were selected with |logFC| ≥ 1 and a t-test p-value < 0.05.
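The selection rule above can be illustrated with a minimal Python sketch. It assumes the GEO2R results have already been exported as rows of gene symbol, logFC, and p-value; the rows, gene names, and the function name `select_degs` are hypothetical and are not taken from GSE85990.

```python
# Hypothetical rows mimicking a GEO2R export: (gene symbol, logFC, p-value).
# The values are invented for illustration and are not from GSE85990.
rows = [
    ("GENE_A", 1.8, 0.003),
    ("GENE_B", -2.4, 0.010),
    ("GENE_C", 0.6, 0.001),   # fold change too small
    ("GENE_D", -1.2, 0.200),  # not significant
]

LOGFC_CUTOFF = 1.0  # |logFC| >= 1
P_CUTOFF = 0.05     # p < 0.05


def select_degs(rows, logfc_cutoff=LOGFC_CUTOFF, p_cutoff=P_CUTOFF):
    """Split rows into up- and down-regulated genes that pass both cutoffs."""
    up, down = [], []
    for gene, logfc, p_value in rows:
        if p_value < p_cutoff and abs(logfc) >= logfc_cutoff:
            (up if logfc > 0 else down).append(gene)
    return up, down


up, down = select_degs(rows)
print("upregulated:", up)
print("downregulated:", down)
```

Applying both cutoffs in one pass keeps the up- and downregulated lists separate, which matches the way the enrichment analyses are later run on each group.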

Scanning protein-protein interactions
The candidate genes listed in Supplementary Table S1 were considered as the seed proteins. The STRING database (Szklarczyk et al., 2019) was used to build a PPI network of DEGs, and interactions with a combined score of >0.4 were considered statistically significant. The constructed networks were analyzed to predict the functional relationships between proteins and to deduce the mechanisms involved in disease onset or progression.
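As a rough illustration of the score cutoff, the sketch below filters a list of interaction records and collapses them into undirected edges. The record layout, the placeholder protein names, and the 0-1 score scale are assumptions made for this example; they are not output retrieved from STRING.

```python
# Hypothetical interaction records: (protein_a, protein_b, combined_score).
# Names and scores are placeholders, not values retrieved from STRING.
records = [
    ("P1", "P2", 0.92),
    ("P2", "P1", 0.92),  # same pair listed in the other direction
    ("P2", "P3", 0.41),
    ("P3", "P4", 0.40),  # not strictly above the cutoff
    ("P4", "P4", 0.99),  # self-interaction
]

MIN_SCORE = 0.4  # combined score must be > 0.4


def build_network(records, min_score=MIN_SCORE):
    """Return {(a, b): score} for undirected edges scoring above min_score."""
    network = {}
    for a, b, score in records:
        if a == b or score <= min_score:
            continue
        edge = tuple(sorted((a, b)))  # undirected: (P1, P2) == (P2, P1)
        network[edge] = max(score, network.get(edge, 0.0))
    return network


network = build_network(records)
for (a, b), score in sorted(network.items()):
    print(a, b, score)
```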

GO enrichment and KEGG pathway analysis
The Database for Annotation, Visualization, and Integrated Discovery (DAVID) was used to interpret the functions of the extensive gene set collected (Huang et al., 2007). The Gene Ontology (GO; http://www.geneontology.org) (Carbon et al., 2009) terms and the Kyoto Encyclopedia of Genes and Genomes (KEGG; http://www.genome.jp/kegg/) (Kanehisa et al., 2008) pathways were integrated to create the functionally organized GO pathway term network. Using the DAVID software, functional and pathway enrichment studies were undertaken separately for upregulated and downregulated genes. A p-value < 0.05 was set as the threshold of statistical significance to screen GO terms and KEGG pathways.
An extended network consisting of seed proteins with their direct neighbors in the network, along with their interactions, was constructed. A high-confidence (0.700) set of interactions was used to construct the PPI network. Further, these PPI networks were visualized using the Cytoscape 3.3.0 tool (Su et al., 2014), and the default parameters were used to calculate network node properties. A single extended giant network was constructed from the DEGs under consideration. The nodes with large betweenness centrality (BC) values in the giant network and their parameters related to network theory were used for topological analysis. To identify the hub protein(s) from the giant network, the bottleneck method implemented in cytoHubba, a plugin for Cytoscape, was used (Chin et al., 2014).

Topological analysis of protein interaction network
The nodes of the giant network and subnetwork were evaluated using topological parameters such as connectivity degree (k), betweenness centrality (BC), and closeness centrality (CC) (Hwang et al., 2008; Raman, 2010). The flow in the network is greatly influenced by nodes with high BC, so this measure is well suited to detecting bottlenecks in a network. Other parameters like average clustering coefficient, mean shortest path length, neighborhood centrality distribution, and diameter are also used to characterize a network (Raman, 2010). The average degree ⟨k⟩ represents the mean of all degree values of nodes in a network. Here, we used the NetworkAnalyzer plugin in Cytoscape (Assenov et al., 2008) to characterize the node parameters and network measurements.
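For readers who want to see what these parameters measure, the following is a minimal pure-Python sketch of degree, closeness centrality, and normalized betweenness centrality (the last computed with Brandes' algorithm) on an unweighted, undirected graph. The five-node toy network and all function names are hypothetical; the study itself used the Cytoscape plugin, and this sketch is not that implementation.

```python
from collections import deque

# Hypothetical toy network as an adjacency mapping; not data from the study.
adj = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D", "E"},
    "D": {"C"},
    "E": {"C"},
}


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness(adj):
    """Closeness: number of reachable nodes divided by the sum of distances."""
    result = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(adj):
    """Normalized betweenness centrality via Brandes' algorithm."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, dist = [], {s: 0}
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)  # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:  # accumulate dependencies from the farthest nodes back
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(adj)
    # Each pair is visited from both ends, so dividing by (n-1)(n-2)
    # both halves the count and normalizes by the number of pairs.
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: value * scale for v, value in bc.items()}


degree = {v: len(nbrs) for v, nbrs in adj.items()}
avg_degree = sum(degree.values()) / len(degree)
cc = closeness(adj)
bc = betweenness(adj)
print(degree, avg_degree)
print(cc)
print(bc)
```

In this toy graph node C has the highest degree, closeness, and betweenness, since every path between the A-B side and the leaves D and E must pass through it.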

Backbone network creation using high BC values
The importance of a node in a network can be measured by its betweenness centrality (BC), which reflects the number of shortest paths between other pairs of nodes that pass through it. Nodes with high BC function as bottlenecks by mediating communication between other nodes in the network. For a protein, the higher the BC, the more shortest paths intersect at it. These proteins and their links together constitute a backbone network. In the present study, the nodes within the top 5% of BC values were considered high-BC nodes (Goñi et al., 2008; Kim & Kim, 2009), and the backbone network was constructed using these nodes and the links connecting them.
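One possible reading of this construction is sketched below: rank nodes by BC, keep the top 5%, and retain only the direct links among the kept nodes. Treating the backbone as this induced subgraph is a simplifying assumption of the sketch, as are the 40-node chain, the placeholder BC scores, and the function name `backbone`.

```python
# Hypothetical inputs: a 40-node chain n0-n1-...-n39 with made-up BC values.
nodes = [f"n{i}" for i in range(40)]
adj = {v: set() for v in nodes}
for u, v in zip(nodes, nodes[1:]):
    adj[u].add(v)
    adj[v].add(u)
bc = {v: i / 100 for i, v in enumerate(nodes)}  # placeholder scores


def backbone(adj, bc, top_percent=5):
    """Keep the top `top_percent` % of nodes by BC and the links among them."""
    n_keep = max(1, -(-len(bc) * top_percent // 100))  # ceiling division
    ranked = sorted(bc, key=lambda v: (-bc[v], v))     # ties broken by name
    keep = set(ranked[:n_keep])
    links = {tuple(sorted((u, v))) for u in keep for v in adj[u] if v in keep}
    return keep, links


keep, links = backbone(adj, bc)
print(sorted(keep), sorted(links))
```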

Module analysis of the PPI network
A biological network comprises various processes contributed by several subnetworks or functional modules (clusters of proteins). These modules influence the specific functionality of participating nodes, even those that do not have an impact on the core network (Mitra et al., 2013). The global network was subjected to module analysis to identify densely connected regions using a graph-theoretic clustering algorithm, 'Molecular Complex Detection' (MCODE) V.4.1, as a plugin for Cytoscape. The nodes were weighted based on their local neighborhood density to detect densely connected regions in the network (Bader & Hogue, 2003). All the parameters, such as degree threshold (2), node score threshold (0.2), k-core threshold (4), and max depth of network (100), were kept at default. The MCODE was ensured to be unaffected by the expected high false-positive rate in the large-scale interaction dataset of the whole network. Subsequently, the KEGG and GO analyses for genes in this module were performed using the DAVID software.
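Two ingredients of this kind of density-based clustering can be sketched in a few lines of Python: pruning a graph to its k-core, and scoring a cluster as density multiplied by the number of nodes. The sketch assumes that this is how the cluster score is defined and does not reproduce MCODE's vertex weighting or seed expansion; the toy graph and function names are hypothetical.

```python
# Hypothetical toy graph: a 4-clique (Q1-Q4) with one pendant node Q5.
adj = {
    "Q1": {"Q2", "Q3", "Q4", "Q5"},
    "Q2": {"Q1", "Q3", "Q4"},
    "Q3": {"Q1", "Q2", "Q4"},
    "Q4": {"Q1", "Q2", "Q3"},
    "Q5": {"Q1"},
}


def k_core(adj, k):
    """Iteratively drop nodes that have fewer than k neighbours left."""
    core = {v: set(nbrs) for v, nbrs in adj.items()}
    changed = True
    while changed:
        changed = False
        for v in [v for v, nbrs in core.items() if len(nbrs) < k]:
            for w in core.pop(v):
                if w in core:
                    core[w].discard(v)
            changed = True
    return core


def cluster_score(sub):
    """Density times node count, i.e. 2E / (N - 1) for N nodes and E edges."""
    n = len(sub)
    if n < 2:
        return 0.0
    edges = sum(len(nbrs) for nbrs in sub.values()) / 2
    density = 2 * edges / (n * (n - 1))
    return density * n


core = k_core(adj, 3)
score = cluster_score(core)
print(sorted(core), score)
```

Under this definition a fully connected cluster of N nodes scores exactly N, so the score rewards clusters that are both large and dense.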

Selection of natural products as drug leads
To assess the druggability of the predicted target, in-silico molecular interaction studies were performed. An in-house screening method was employed to filter suitable drug leads. The detailed methodology is described in our previous publications. Briefly, a natural product library consisting of 26,609 structures was constructed. The library was screened against a synthetic drug library to collect the natural products structurally most similar to synthetic drugs. For the comparison, molecular properties, two- and three-dimensional structural similarities, activity cliffs, and core fragments of natural products (NPs) and chemical drugs were considered. This comparison yielded the natural products Isoquercitrin, Naringin, Kaempferol-3-neohesperidoside, Rutin, Hesperidin, Procyanidin, Phlorizin, and Mangiferin as structurally most similar to an SGLT-2 inhibitor, Luseogliflozin. Sodium-glucose co-transporter 2 (SGLT2) protein (PDB ID: 2XQ2) was docked with the identified molecules to assess their probable mode of interactions. Based on these results, Isoquercitrin and Rutin were selected for the present studies.

Molecular docking studies
Automated docking was performed to identify the potential MMP-9 inhibitor (Raghavendra et al., 2015). The selected natural products, Isoquercitrin and Rutin, were docked with the MMP-9 target, and the bound structures were chosen for assessment of the protein's structural stability. Briefly, the macromolecule was collected from the Protein Data Bank (PDB ID: 5CUH) and prepared for molecular docking studies as described in our earlier publications (Janakirama et al., 2021). The Broyden-Fletcher-Goldfarb-Shanno algorithm implemented in AutoDock Vina was used to study the binding interactions between the rigid macromolecule and flexible ligand (Aditya et al., 2020; Trott & Olson, 2010).

United-atom molecular dynamic simulation studies to predict protein stability
The physical movement of atoms and molecules, along with the structural stability of the macromolecule upon ligand binding, was assessed using MD simulations run for a time scale of 100 nanoseconds (ns) (Aditya Rao & Nandini Shetty, 2021). For simulation studies, the CHARMM36 all-atom force field (July 2020) (Best et al., 2012) implemented in the GROMACS package (Abraham et al., 2015), version 2018, was employed. Briefly, the protein was enclosed in a periodic cubic solvated box, with the box's edge at least 10 Å distant from the protein. The simple point charge (SPC) model (Berendsen et al., 1981) was used for solvating the system, which was neutralized with sodium and chloride ions. The ligand topologies were created using the CHARMM General Force Field (CGenFF) (https://cgenff.umaryland.edu/) (Vanommeslaeghe et al., 2010). The steepest descent method was employed for energy minimization. The temperature (300 K) and pressure (10^5 Pa) coupling were done using a V-rescale thermostat (Bussi et al., 2007) and Parrinello-Rahman barostat (Parrinello & Rahman, 1981), respectively. The LINCS algorithm was used to apply bond length constraints (Hess et al., 1997), and the electrostatic interactions were evaluated using the particle mesh Ewald (PME) method (Essmann et al., 1995). The final MD trajectories were prepared at a time step of 2 fs with trajectory coordinates written at regular 10 ps intervals for 100 ns. The gmx energy, gmx rms, gmx rmsf, gmx gyrate, gmx do_dssp, and gmx sasa modules of GROMACS and the interaction energies were used to analyze the production MD trajectories (Aditya Rao & Nandini Shetty, 2021).
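The production-run settings quoted above imply a fixed number of integration steps and saved frames, which the short Python calculation below works out. The variable names are illustrative, and counting the starting frame as saved is an assumption; in a GROMACS `.mdp` file, quantities of this kind would typically be expressed through options such as `dt`, `nsteps`, and `nstxout-compressed`, although the actual parameter file used in the study is not given.

```python
# Settings quoted in the text.
TIME_STEP_FS = 2         # integration time step
SIM_LENGTH_NS = 100      # production run length
WRITE_INTERVAL_PS = 10   # coordinate output interval

FS_PER_PS = 1000
PS_PER_NS = 1000

# Number of integration steps needed to cover the full run.
total_steps = SIM_LENGTH_NS * PS_PER_NS * FS_PER_PS // TIME_STEP_FS

# Number of steps between two saved frames.
steps_per_frame = WRITE_INTERVAL_PS * FS_PER_PS // TIME_STEP_FS

# Saved frames, assuming the starting frame (t = 0) is also written.
frames_written = total_steps // steps_per_frame + 1

print(total_steps, steps_per_frame, frames_written)
```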

Binding free energy calculations using g_mmpbsa
The molecular mechanics/Poisson-Boltzmann surface area (MM-PBSA) calculations were performed on a new trajectory consisting of the final 10 ns of the production trajectory, with frames taken every 200 picoseconds (ps) (Miller et al., 2012). The g_mmpbsa package (Kumari et al., 2014) was employed to calculate the binding energy, which relies on the following equations:

$$\Delta G_{\mathrm{Binding}} = G_{\mathrm{Complex}} - \left(G_{\mathrm{Protein}} + G_{\mathrm{Ligand}}\right)$$

$$G_{X} = E_{\mathrm{MM}} + G_{\mathrm{Solvation}}$$

where G_Complex is the total free energy of the complex, G_Protein and G_Ligand are the total free energies of the protein and ligand, E_MM is the vacuum potential energy, G_Solvation is the free energy of solvation, and X stands for the complex, the protein, or the ligand.
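The frame selection and the binding energy difference described above can be illustrated with a small Python sketch. The per-frame energies are invented placeholders in kJ/mol, only three frames are shown for brevity, and the function name `binding_energy` is hypothetical; the sketch only shows the arithmetic and is not the g_mmpbsa implementation.

```python
from statistics import mean, stdev

WINDOW_NS = 10    # final stretch of the production run
STRIDE_PS = 200   # spacing between the frames used

# Number of 200 ps intervals in 10 ns (one more frame if both ends are kept).
n_frames = WINDOW_NS * 1000 // STRIDE_PS

# Hypothetical per-frame free energies in kJ/mol: (complex, protein, ligand).
frames = [
    (-1500.0, -1300.0, -100.0),
    (-1510.0, -1305.0, -102.0),
    (-1495.0, -1298.0, -99.0),
]


def binding_energy(g_complex, g_protein, g_ligand):
    """Binding energy as G(complex) - (G(protein) + G(ligand))."""
    return g_complex - (g_protein + g_ligand)


per_frame = [binding_energy(*frame) for frame in frames]
mean_dg = mean(per_frame)
spread = stdev(per_frame)
print(n_frames, per_frame, mean_dg, spread)
```

Averaging over frames, rather than evaluating a single snapshot, is what allows the reported binding energy to reflect the sampled conformations of the complex.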

Identification of differentially expressed genes (DEGs)
The GEO2R method was used to identify DEGs from the data set to decipher the link between diabetes and cancer. A gene expression study [GSE85990] suggested a regulatory pathway linking glucose and AMPK to TET2 and 5-hydroxymethylcytosine (5hmC), in turn connecting diabetes to cancer (Wu et al., 2018). Based on the initial data, the inclusion criteria were set to p < 0.05 and |logFC| ≥ 1.0. The study yielded 415 DEGs, of which 81 were upregulated and 334 were downregulated. The volcano plot of all DEGs and the heat map of the upregulated and downregulated genes are shown in Figures 1 and 2, respectively.

GO functional enrichment and pathway analysis
The GO functional enrichment analysis of both the upregulated and downregulated DEGs was carried out using the DAVID tool. The top 5 enrichment studies for each part of the gene ontology (GO) analysis are given in Figure 3.

Construction of PPI network
One giant network and three separate small components constituted the extended network. The small components are given in Table S2. For the giant network, the largest degree and BC values were 36 and 0.1478, respectively (Table 1). It was further observed that the giant network was characterized by a small number of highly connected nodes, with most nodes having very few connections, which is the classical character of a PPI network (Lima-Mendez & Van Helden, 2009). The backbone network for PPIs was constructed from 14 nodes with higher BC values. The backbone network has 29 links (Figure 6), in which CXCL8 is located at the center with a CC value of 0.619, controlling the flow of information in the backbone network (Table 1). Meanwhile, MMP-9, with a high BC value, has six first neighbors, namely CXCL8, COL6A1, VCAN, SOCS3, IL1B, and PTGS2.

Hub protein(s) in the PPI
In the present study, cytoHubba was used to predict the hub proteins. The classification method in cytoHubba was employed to select the top 10 hub proteins based on the bottleneck ranking method (Table 3). The subnetwork was constructed from 94 nodes and 376 edges, highlighting the top ten proteins with high degree and BC values in the giant network (Figures 6 and 7). MMP9 was identified as the hub protein, having the highest BC value (0.1478) and the second-largest degree (34) (Table 1). The remaining nine neighbors share lower BC, degree, and CC values with the identified hub protein.

Key nodes of the backbone network
The application of network or graph theory can make biological network analysis simpler and more achievable. It aids in determining the possibility of previously undiscovered links between proteins and genes (Nibbe et al., 2011). The PPIs display the topological arrangements correlating the importance of proteins. These connections indicate the importance of genes or proteins with their topological roles and help classify them based on their locality, known as hubs (Estrada, 2006). The present study identified 24 nodes with larger degree values and 30 with large BC values (Tables 2 and 3). Out of 24, the top 15 nodes with larger degree and BC values were selected (Table 4). Matrix-metalloproteinase-9 (MMP9) was thus identified as a hub protein with the highest BC value and second-highest degree, while Interleukin-1 beta (IL1B) had the second-highest BC value and the largest degree. Interleukin-8 (CXCL8) has the highest CC value, which justifies its location at the center of the network. Figure 3 represents the corresponding roles of key nodes in the network, which are highlighted in different colors with node sizes proportional to their BC values.

Module analysis
The Molecular Complex Detection (MCODE) algorithm was employed for clustering analysis. The clusters were filtered based on the parameters mentioned in the methodology section to ensure the efficiency of the functional partners. We identified four efficient clusters based on the MCODE score and the minimum number of nodes for clusters (set to 8). The cluster scores of the giant network are listed in Table 5, and the participating proteins in each cluster (C1-C4) are shown in Figure 8.

Molecular docking studies
Automated docking studies help predict the most effective binding of selected molecules with their appropriate targets (Janakirama et al., 2021; Malathi & Ramaiah, 2018; Sukanya et al., 2022). The natural products were selected considering their structural similarities with the chemical drugs. As structurally similar molecules are expected to exert similar action (Gfeller et al., 2014; Kumar & Zhang, 2018; Martin et al., 2002), the selected molecules were studied for their interactions with the target of the structurally similar chemical drug, Luseogliflozin (Table 6). Among the selected natural products, Isoquercitrin bound to MMP-9 with a most negative binding energy of -9.3 kcal/mol, while Rutin displayed a most negative binding energy of -9.9 kcal/mol. The binding of the natural products was found to be better than that of their synthetic drug counterpart, Luseogliflozin (-8.4 kcal/mol), in terms of binding energy. The binding interactions were found to be stabilized by the formation of hydrogen bonds (Figure 9). Further, molecular dynamics (MD) simulation studies were performed to validate the effectiveness of binding and binding stability (Miryala et al., 2021b).

United-atom molecular dynamic simulation studies to predict protein stability
In the present study, the MD trajectory analysis revealed the equilibrium state attained by the structure observed for a duration of 100 ns. The average potential energy was found to be -0.4393 × 10^-6 for the unbound MMP-9 structure, -0.4389 × 10^-6 for the Luseogliflozin-bound structure, -0.4388 × 10^-6 for the Isoquercitrin-bound structure, and -0.4383 × 10^-6 for the Rutin-bound structure. The consistency in average potential energy of bound and unbound structures indicates that the structures have reached equilibrium. The structural stability assessment was done by considering the root mean square deviation (RMSD), root mean square fluctuation (RMSF), the radius of gyration (Rg), and solvent accessible surface area (SASA) (Figure 10). The RMSD measures the deviation of the Cα atoms of the protein backbone from the reference structure. The RMSD plot (Figure 10a) indicated that the native, Luseogliflozin-bound, and Isoquercitrin-bound structures attained equilibrium after a series of initial structural refinements within 5 ns. The average RMSD of the native, Luseogliflozin-bound, and Isoquercitrin-bound structures was found to be 0.1543 nm, 0.1477 nm, and 0.1520 nm, respectively, while the Rutin-bound structure attained equilibrium after 20 ns and retained its stable conformation until the completion of the simulation with an average RMSD of 0.2068 nm (Table 7). The RMSF was calculated by measuring fluctuations associated with the backbone residues of the macromolecules observed during the simulation. The average RMSF was lowest in the Luseogliflozin-bound structure, at 0.0730 nm. The Isoquercitrin- and Rutin-bound structures displayed almost similar fluctuations, with average RMS fluctuations of 0.0973 nm and 0.0966 nm, respectively, as can be inferred from the RMSF plot (Figure 10b). The radius of gyration (Rg), which reflects how compactly the secondary structures are packed, also indicated constant structural arrangements (Figure 10c). This was also supported by SASA analysis (Figure 10d). However, some variations were observed in the Isoquercitrin-bound structures after 80 ns of simulation. Though the structure was under equilibrium, the rearrangement of secondary structures could have caused these variations. The average Rg and average SASA are given in Table 7.
The binding free energy calculations were performed using the g_mmpbsa module. The results indicated that the natural product Rutin binds to MMP-9 with the lowest binding energy, -103.013 kJ/mol, compared to Isoquercitrin (-91.608 kJ/mol). The results were compared with the binding energy of the standard drug, Luseogliflozin, which displayed a binding energy of -92.118 kJ/mol. Figure 10e indicates the total free energy of each residue in the Luseogliflozin-, Isoquercitrin-, and Rutin-bound structures. The associated terms for binding free energy calculations are detailed in Table 7.
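The RMSD and RMSF measures used in the stability assessment above can be sketched in plain Python as follows. The sketch assumes the frames are already superimposed on the reference (no least-squares fitting is performed here), and the two-atom trajectory, its coordinates in nm, and the function names are hypothetical.

```python
import math


def rmsd(coords, reference):
    """RMSD between two equally sized, pre-aligned coordinate sets."""
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(reference))


def rmsf(trajectory):
    """Per-atom RMS fluctuation about the time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        mean_pos = [
            sum(frame[i][d] for frame in trajectory) / n_frames
            for d in range(3)
        ]
        msf = sum(
            sum((frame[i][d] - mean_pos[d]) ** 2 for d in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msf))
    return result


# Hypothetical two-atom, two-frame trajectory (coordinates in nm).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trajectory = [
    [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)],
    [(0.0, 0.0, -0.1), (1.0, 0.0, -0.1)],
]

rmsd_per_frame = [rmsd(frame, reference) for frame in trajectory]
rmsf_per_atom = rmsf(trajectory)
print(rmsd_per_frame, rmsf_per_atom)
```

Because both quantities are square roots of mean squared displacements, they are non-negative by construction: RMSD summarizes one frame over all atoms, whereas RMSF summarizes one atom over all frames.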

Discussion
Mining for differentially expressed genes (DEGs) and building a PPI network is one of the effective ways to explore the biological significance behind cellular homeostasis and disease (Gollapalli et al., 2021). Thorough mining of the proteomic data can unravel novel pathogenic mechanisms and contributing factors underlying normal and disease conditions. Furthermore, it can also serve as a foundation for designing new therapeutic strategies by identifying potential targets (Sardiu & Washburn, 2011; Xia et al., 2014). Type 2 diabetes and cancer share many common risk factors. However, the potential biological link that connects the two is still unclear (Giovannucci et al., 2010; Wu et al., 2018). Suppression of the mammalian target of rapamycin complex 1 (mTORC1) pathway is the most thoroughly understood mechanism by which AMPK regulates cellular activities and is one of the most extensively studied mechanisms to understand this connecting link (Mihaylova & Shaw, 2011). On the other hand, the G-protein coupled receptor protein signaling pathway, focal adhesion, the MAPK signaling pathway, and neuroactive ligand-receptor interaction have also been studied to understand their involvement in carcinoma development (Liu et al., 2015). However, experimental evidence suggests that numerous such pathways may be involved in developing cancerous conditions associated with diabetes (Dong et al., 2020). Wu et al. (2018) described the Ten-Eleven Translocation-2 (TET2) protein, a substrate of AMPK, as a tumor suppressor acting as a connecting link between diabetes and cancer. At high blood glucose conditions, TET2 protein is destabilized, causing the dysregulation of 5-hydroxymethylcytosine (5hmC) and of the tumor-suppressive function of TET2. Thus, the TET2 protein acts as a switch regulating the pathway linking glucose and AMPK to TET2 and 5hmC.
In the present study, we used the gene expression dataset [GSE86376] to establish a link between glucose and AMPK to TET2 and 5-hydroxymethylcytosine (5hmC). Using the protein-network-based approach, we identified a hub protein subnetwork with functional insight associated with a comorbid condition of T2D and cancer. The giant network constructed in the study revealed MMP9 as a hub protein with the highest BC and the second-largest degree value. MMPs (matrix metalloproteinases) are proteolytic enzymes that belong to the zinc-dependent endopeptidase family and can degrade almost all proteinaceous extracellular matrix (ECM) components (Garcia-Fernandez et al., 2020; Tan & Liu, 2012). MMPs are recognized to play a vital role in various renal disorders, including glomerulonephritis, tubular diseases, and some genetic kidney diseases (Li et al., 2014; Zakiyanov et al., 2019). They also have an essential role in local proteolysis of the extracellular matrix and in leukocyte migration. Increasing evidence in recent years has suggested that the activity or expression of MMPs has a vital role in renal disease in diabetic nephropathy patients (Garcia-Fernandez et al., 2020; Shalaby et al., 2021; Xu et al., 2014). Studies suggest that the down-regulation of MMPs or up-regulation of tissue inhibitors of metalloproteinases (TIMPs) in the kidney could contribute to fibrosis (Garcia-Fernandez et al., 2020; Lu et al., 2011; Tsai et al., 2012).

Table 5
Cluster | Score | Nodes | Edges | | Proteins
C1 | | | | | CXCL10, MMP8, CXCL2, MMP9, CXCL3, SPP1, PTX3, SOCS3, TIMP1, TNFAIP3, HMGB1, IL11, TNFRSF11B
C2 | 6.5 | 9 | 26 | 0.871 | PLOD2, COL6A3, LAMA4, COL11A1, COL6A1, COL13A1, NCAM1, COL8A1, LEPREL1
C3 | 6 | 6 | 15 | 1.0 | UTS2, P2RY1, CHRM3, PIK3R3, PLCB1, F2RL1
C4 | 5.5 | 13 | 33 | 0.800 | SOD2, CCL2, VCAN, ISG15, PTGS2, PRSS23, IGFBP3, GAL, IL1B, WFS1, APLP2, CXCL8, CXCL1
TIMPs play an important role in cell signaling, as they inhibit MMPs and ADAMs (a disintegrin and metalloproteases), thereby restricting the development of cancer, inflammation, and degenerative diseases (Raeeszadeh-Sarmazdeh et al., 2020). Apart from this, MMPs are also associated with the release of different growth factors such as tumor necrosis factor (TNF-α), transforming growth factor-β1 (TGF-β1), and others (Thrailkill et al., 2009), which can be exploited for new therapeutics development (Fields, 2019). Our cluster analysis results showed an average degree of six, signifying that each node is connected to approximately six other nodes in the network (Barabási & Oltvai, 2004).
The genes MMP9, MMP8, TIMP1, SPP1, TNFRSF11B, CXCL10, CXCL2, CXCL3, TNFAIP3, HMGB1, and PTX3 are mainly involved in biological processes like extracellular matrix disassembly, inflammatory response, response to lipopolysaccharide, regulation of cell proliferation, and the chemokine-mediated signaling pathway. MMP9, identified as a hub in the giant network, is also involved in metallopeptidase activity. In the KEGG pathway enrichment analysis, MMP9 was found to be involved in the TNF signaling pathway; the genes CXCL10, CXCL2, CXCL3, and IL11 are involved in cytokine-cytokine receptor interaction and chemokine signaling pathways, while CXCL2 and TNFAIP3 are involved in the NOD-like receptor signaling pathway (Zhou et al., 2009).
Epidemiologic studies have suggested that patients with type 1 and type 2 diabetes have a high chance of developing malignancies (Habib et al., 2012). In search of a potential therapeutic target for diabetes-related renal cell carcinoma (RCC), Matrix-metalloproteinase-9 (MMP-9) has emerged as a promising candidate due to its involvement in the degradation of extracellular matrix components, tissue remodeling, cellular receptor stripping, and processing of various signaling molecules (Kaminari et al., 2018). Studies have demonstrated that inhibition of MMP-9 activity resulted in increased amyloid formation and the resultant β-cell apoptosis in an islet culture model (Aston-Mourney et al., 2013; Meier et al., 2015). Further, the same study reported reduced MMP-9 mRNA expression in islets of subjects with T2D compared to non-diabetic controls (Aston-Mourney et al., 2013; Kaminari et al., 2018). There is also a well-established connection between insulin and MMP-9, wherein insulin stimulates MMP-9 activation through insulin receptor activation (Fischoeder et al., 2007). The mechanism behind insulin receptor activation may involve crosstalk between MMP-9 and Neu1 protein (Alghamdi et al., 2014). In line with these studies, the current research supports the view that MMP-9 could constitute a promising target for interfering with the development and progression of T2D into cancer.

Conclusion
The extension of T2D to cancer, especially renal cell carcinoma (RCC), is alarming and of grave concern. Early and careful screening of diabetes and its extension to cancer are currently the best methods to prevent the occurrence of diabetes-linked RCC. Therefore, the need for effective screening strategies to diagnose this efficiently is of utmost importance. With vast data generated from omics studies, the systems biology methods are increasingly explored to predict new perspectives for future disease-associated studies. In the present investigation, we studied the association between T2D and cancer by combining gene expression profiles and the comprehensive biological network, including PPIs, along with metabolic and regulatory links. Our study has unraveled the possible association between diabetic kidney disease and renal cell carcinoma, mainly relying on the involvement of MMP-9 proteins in several pathways in digestive metabolism, immunization, and signal transduction. In conclusion, our network-based association approach has provided additional insight with a systematic explanation for the close association between diabetic kidney disease and renal cell carcinoma compared to individual gene-based studies. However, a meticulous experimental design to understand the molecular mechanism of MMP9 activity in the tumorigenesis process could render MMP-9 as a potential therapeutic target.