New putative therapeutic targets against Serratia marcescens using reverse vaccinology and subtractive genomics

Abstract The Gram-negative bacillus Serratia marcescens, a member of Enterobacteriaceae family, is an opportunistic nosocomial pathogen commonly found in hospital outbreaks that can cause infections in the urinary tract, bloodstream, central nervous system and pneumonia. Because S. marcescens strains are resistant to several antibiotics, it is critical the need for effective treatments, including new drugs and vaccines. Here, we applied reverse vaccinology and subtractive genomic approaches for the in silico prediction of potential vaccine and drug targets against 59 strains of S. marcescens. We found 759 core non-host homologous proteins, of which 87 are putative surface-exposed proteins, 183 secreted proteins, and 80 membrane proteins. From these proteins, we predicted seven candidates vaccine targets: a sn-glycerol-3-phosphate-binding periplasmic protein UgpB, a vitamin B12 TonB-dependent receptor, a ferrichrome porin FhuA, a divisome-associated lipoprotein YraP, a membrane-bound lytic murein transglycosylase A, a peptidoglycan lytic exotransglycosylase, and a DUF481 domain-containing protein. We also predicted two drug targets: a N(4)-acetylcytidine amidohydrolase, and a DUF1428 family protein. Using the molecular docking approach for each drug target, we identified and selected ZINC04259491 and ZINC04235390 molecules as the most favorable interactions with the target active site residues. Our findings may contribute to the development of vaccines and new drug targets against S. marcescens. Communicated by Ramaswamy H. Sarma


Introduction
Previously, in 1819, the Gram-negative bacillus Serratia marcescens, a member of Enterobacteriaceae family, was long thought to be nonpathogenic. This characteristic led it to be used as a tracer organism in medical experiments and as a biological warfare test agent (Mahlen, 2011;Thomson et al., 2000). Later, S. marcescens was identified as an opportunistic nosocomial pathogen that can cause some infections like urinary tract infections, pneumonia, bloodstream infections, endocarditis, central nervous system infections, such as meningitis, and wound infections (Mahlen, 2011;Mills & Drew, 1976;Wu et al., 2013). Over the years it became evident that S. marcescens is not the primary invasive infection but easily infects patients with debilitating or immunocompromising disorders, mainly those treated with broad-spectrum antibiotics (Mahlen, 2011;Wilkowske et al., 1970), some behavior that can be evidenced worldwide. The major bacteria spreading mechanism is through invasive instrumentation that is very used in these patients and includes the indwelling urinary catheters, intubation material and intravenous catheter. This explains why the mains routes to infection occur in the urinary tract, bloodstream and respiratory tract (Maki et al., 1973;Sader et al., 2020;Voelz et al., 2010).
Nowadays, S. marcescens is commonly found in outbreaks, mainly in Neonatal Intensive Care Units (NICUs), Intensive Care Units (ICUs) and other hospitals areas (Ferreira et al., 2020;Raymond et al., 2000;Saralegui et al., 2020;Si sirak & Huki c, 2013;Zingg et al., 2018). The data about outbreak reports underestimates the true occurrence of S. marcescens in neonatology units and NICUs (Zingg et al., 2018). In fact, this bacteria have an occurrence of 15% of all isolates from nosocomial infections in European NICUs (Raymond et al., 2000) and is the fifth Gram-negative bacteria that cause bloodstream infections in pediatric patients from United States (US) medical centers (Sader et al., 2020). Usually, strains of S. marcescens involved in epidemic events are multidrug-resistant (Cristina et al., 2019).
A study about the lung damage during S. marcescens pneumonia in mice model indicates that the pathogenesis of this bacteria is, despite a rapid depletion of alveolar macrophages (AM) by necroptosis, damaging mitochondrial membranes, inducing the ATP depletion via a cytoplasmic leak, and driving reactive oxygen species (ROS) production (Gonzalez-Juarbe et al., 2018;Gonz alez-Juarbe et al., 2015). Moreover, the adherence has a role in the S. marcescens virulence and pathogenicity, once the contact of the bacteria with epithelial cells was essential to their cytotoxicity (Krzymi nska et al., 2012). This induction of necroptosis in AMs not only abolishes a crucial element of the early immune response but also causes inflammation and tissue damage, which intensifies the disease (FitzGerald et al., 2020). The adherence has been related to piliation, with pili adhesins, mannose-resistant and mannose-sensitive, causing erythrocytes agglutination. The renal damage caused by S. marcescens infections does not appear to be related directly to bacterial growth in the kidney, but to the inflammatory process, particularly infiltration by polymorphonuclear leucocytes that together to mannose-sensitive piliate bacteria stimulate the production of active oxygen radicals (Hejazi & Falkiner, 1997). Other virulence factor is related to the lipopolysaccharides (LPS) biosynthesis and iron uptake. A screening using a Caenorhabditis elegans as host to detect virulence factors for S. marcescens found gene mutations of enzymes involved in LPS production that attenuated the virulence in the nematodes. This work also suggested that the bacteria exerts virulence killing host immune cells to impair cytokine production and other immune responses (Kurz et al., 2003). Moreover, mutants of the motility-related genes wecA gene (responsible for LPS synthesis) and the flhD and fliR genes (responsible for flagella synthesis), had severely impaired host cell-killing phenotypes indicating that S. marcescens motility is critical to the high pathogenicity (Ishii et al., 2012).
Once S. marcescens strains are intrinsically resistant to a lot of antibiotics, the options of treatment for these organisms are fewer. During the last two decades, the treatment for these infections was empirically done with the use of piperacillin-tazobactam, a fluoroquinolone, an aminoglycoside or a carbapenem. Afterwards, based on strains susceptibility test results, if available, the treatment was then modified (Mahlen, 2011). The emergency for studies about S. marcescens highly increased in the last years because of its resistance to antibiotics, including that previously used in its treatment, like aminoglycosides, fluoroquinolones and thirdgeneration cephalosporins (Herra & Falkiner, 2018;Lockhart et al., 2007). Recently, a study showed that S. marcescens has an intrinsic resistance to polymyxin, considered to be the final alternative for treating carbapenem-resistant Enterobacteriaceae pathogens (Leclercq et al., 2013). The WHO published a priority list for research and development of new antibiotics for antibiotic-resistant bacteria. The priority list was categorized into three tiers: critical, high and medium (WHO, 2017). Some members of Enterobacteriaceae family that are carbapenem-resistant and third-generation cephalosporin-resistant (the best available antibiotics for treating multi-drug resistant bacteria), including Serratia sp., were categorized with critical priority for the need of new drugs (Tacconelli et al., 2018;WHO, 2017).
The resistome studies also show that the genes related to antibiotic resistance are intrinsic in the genomes between S. marcescens, including in historical strains collected at the end of the 1960s, a pre-antibiotic period when some antibiotics and disinfectants were not commercialized yet. This finding suggests that the resistant strains were selected as new antibiotics were being used in the health treatment network (Saralegui et al., 2020). This behavior is observed in many other microorganisms every time that a new antibiotic emerged and will eventually result in antibiotic useless. Vaccines today are a good increment to control infections over a long period of time without becoming obsolete because it acts prophylactically and prevents the start of infections, while drugs work therapeutically on an ongoing infection. This avoids that a bacterium proliferate and mutate, and consequently decrease the selection of the multidrug-resistant variants (Tagliabue & Rappuoli, 2018;van Hoek et al., 2011). In addition, if vaccination is done in sufficiently high levels of coverage, it may provide indirect protection by decreasing the risk of infection in the community and protecting vulnerable people, which eventually may not be vaccinated or with weak immunity (e.g. neonates, immunocompromised and older people) (Mallory et al., 2018).
In the last years, several vaccine candidates are being studied for organisms with antimicrobial resistance including those that demand a high antibiotic use or nosocomial infections (surgical site infections and urinary tract infections) (Buchy et al., 2020). Moreover, vaccines have been shown advantageous for antimicrobial-resistant organisms due to their effectiveness in preventing infections, which has hugely improved, as a consequence of the enormous technological developments. The new techniques of genome sequencing accelerated considerably the process of discovering novel vaccine candidate, with the help of bioinformatics tools (Tagliabue & Rappuoli, 2018). With reverse vaccinology approach, using the information of whole genome sequences and bioinformatics methods, it is possible to predict the potential vaccine candidates without necessarily handling the pathogen (Muzzi et al., 2007;Rappuoli, 2000). For this, some described features of effective vaccine candidate proteins include the sub-cellular localization (the presence of signal peptides and transmembrane domains, for instance) and antigenic epitopes (Muzzi et al., 2007).The first vaccine produced and licensed using this approach was designed against meningococcus B (Serruto et al., 2012). A number of other vaccine candidates against pathogens are in development using reverse vaccinology including, Acinetobacter baumannii, Neisseria meningitidis, Treponema pallidum, Mycoplasma pneumoniae, among others (Kelly & Rappuoli, 2005;Kumar Jaiswal et al., 2017;Ni et al., 2017;Vilela Rodrigues et al., 2019). This suggests the great potential of these new bioinformatic platforms for prospecting vaccine antigens with the ability to induce protective immune responses.
The approaches that use a comparative and subtractive genomic together with metabolic pathways analysis bring robust information about non-host homologous pathogen essential proteins, which are the putative targets for vaccines and drugs development. Recently, many vaccines against bacterial pathogens have been developed through this approach (Barh et al., 2011). In contrast with the traditional drug discovery that is laborious, slow and expensive, a computational approach is an attractive way to identify potential drug targets because it accelerates the drug discovery process, increases treatment options, and reduces drug failure rate in clinical trials (Shanmugham & Pan, 2013). The bioinformatics methods predict a potential drug target based on its selectivity/specificity and essentiality, subcellular location, the ability to act as a broad-spectrum target, functional interaction with metabolic proteins, cellular function, and druggability (Shanmugham & Pan, 2013).
Thus, since the control of infections by multi-drug resistant S. marcescens is urgent and new forms are needed for the treatment of these bacteria, this work aims to predict potential vaccines candidates using reverse vaccinology as well as to prospect for new potential species-specific drugs against this pathogen.

Genomes
From the GenBank database of the NCBI -National Center for Biotechnology Information (https://www.ncbi.nlm.nih.gov/ genbank), we downloaded the 59 complete genomes of S. marcescens strains available (Table 1) and converted to the FASTA format for the bioinformatics analysis.

Identification of conserved proteins of S. marcescens and subtractive genomics
The amino acid sequences in FASTA format files were submitted to the software OrthoFinder (Emms & Kelly, 2015) using default parameters. Through searches in BLAST and the MCL clustering algorithm, we identified the regions of homology, where the software creates the orthogroups with the protein sequences. After, the genes were classified into three groups: core genes, those present in all studied strains; shared genes, those present only in some strains; and the singletons, the genes that are strain-specific (Jaiswal et al., 2020). Here, we considered to be part of the core genome the CDSs (Coding DNA Sequences) shared by all 59 S. marcescens strains. For the subtractive genomics, an essential step to the selection of drug targets or vaccines is the identification of those that are not prone to cause autoimmunity.
For this, still using OrthoFinder, the S. marcescens amino acid sequences were submitted to BLASTp against the human genome proteins to identify the non-homologous proteins, i.e. those without homology with host proteins (Mondal et al., 2015).

Prediction of subcellular localization
The SurfG þ software predicts the subcellular localization of the proteins of interest based on identifying signal peptides, retention signals, transmembrane helices, and secretion pathways to classify proteins as secreted, PSE (putatively surfaceexposed), and membrane proteins (Barinov et al., 2009). So, we analyzed the S. marcescens non-host homologous conserved proteome previously identified and the cytoplasmic proteins were subjected to analysis for potential drug targets, due to their involvement in the basic survival processes of the organism. The membrane, PSE, and secreted proteins were directed to reverse vaccinology analysis once they are the first proteins to come into contact with the immune response of the host (Vilela Rodrigues et al., 2019).

Prediction of antigenic properties for vaccine candidates
To analyze the antigenic properties of proteins to vaccine candidate we used the Vaxign system (http://www.violinet. org/vaxign/), a pipeline of software programs which predict possible vaccine based on various criteria that includes antigen sublocation, transmembrane helices, adhesion probability, conservation to human and/or mouse proteins, sequence exclusion from genomes of nonpathogenic strains, and epitope binding to MHC (Major Histocompatibility complex) class I and II. The Vaxitope is an internally developed program that through searches in the Immune Epitope Database (IEDB) predicts the protein affinity with MHC class I and class II binding epitopes. The optimized SPAAN program predicts the adhesion probability with a sensitivity of 89% and specificity of 100% in a default cut-off of 0.51 (He et al., 2010). In our analysis, we use the 'Dynamic Vaxign Analysis' and the cut-off > 0.7. In addition, we searched for more antigenic features important for vaccine development with the S. marcescens proteins selected. For protein identification and prediction of its functional domains we used the Universal Protein Resource (UniProt) database (Consortium, 2010). We also verified the presence of signal peptides in vaccine candidates using SignalP program, which localizes cleavage sites of signal peptides involved with secretory pathway directing (Almagro Armenteros et al., 2019;Petersen et al., 2011). The transmembrane helix number is an important factor for vaccinology due to the difficulty to purify proteins with more than one transmembrane helix. Therefore, we used the TMHMM server to predicts the topology of these proteins by the Hidden Markov method (Krogh et al., 2001).

Prediction of drug targets
The cytoplasmic proteins were used to predict the best candidates to drug targets. Thus, the proteins were submitted to MHOLline workflow (http://www.mholline.lncc.br) which combines a set of programs for automated protein structure prediction, detection of transmembrane regions, and Enzyme Commision number association. These programs are the HMMTOP (Prediction of Transmembrane helices and Topology of Proteins) to identify transmembrane regions; BLAST algorithm, as template structure identification by performing searches against the Protein Data Bank (PDB); BATS (Automatic Targeting for Structures), that identifies by comparative modeling techniques the better template sequences and the 3D model construction quality from 'very high' to 'very low'; MODELLER, that performs the automated alignment for homology or comparative modeling of protein three-dimensional structures and PROCHECK, which verifies the stereochemical quality. Then, MHOLline generates and aggregates structural information, returns a 3D model, a Ramachandran plot, structure quality and enzymatic function data (Capriles et al., 2010;Westbrook et al., 2002). The BATS program organized the BLAST output files into four groups G0, G1, G2 and, G3. In this work, only proteins from G2 group (E-value 10 Â 10 À5 , identity ! 25% and LVI 0.7. LVI: Length Variation Index) classified as 'Very high' and 'High' were selected to the next steps, since the other groups with the 3D structures with identity < 25%, did not fit into the comparative modelling technique in MHOLline program.

Prediction of the essentiality of S. marcescens proteins
The protein essentiality is crucial for the organism survival and consequently this information is the basis for the vaccine candidates and drug targets development. Here, we used DEG, the Database of Essential Genes (http://tubic.tju. edu.cn/deg/), which includes all essential bacterial and eukaryotic gene records, to access the essentiality of S. marcescens selected proteins for both vaccines and drug targets (Zhang et al., 2004). Basic parameters used for BLASTP against DEG were set as default including bit score of 100 and E-value cut-off of 1 Â 10 À6 .

Gut flora non-homology analysis
The gut microbiota has a crucial role in humans providing resistance to colonization of pathogens and opportunistic bacteria, but some drug interactions with gut microbiota can induce toxicity and a reduced bioavailability of the drug. Then, we subjected the selected drug targets to homology search against each of the gut flora proteome using PBIT -Pipeline Builder For Identification of Targets (http://www. pbit.bicnirrh.res.in/) using BLASTP with default parameter (Evalue >0.005 and sequence identity < 50%) (Shende et al., 2016).

Structure analysis and protein preparation
Nowadays, drug discovery and development depend on the detection of compounds (ligands) with low-molecular weight in order to interact with disease-concerned biological molecules known as receptors and molecular targets (Ferreira et al., 2015). Structure-based functional annotations of hypothetical protein targets are considered as more sound technique than a sequence-based approach (Kumar et al., 2015). We used multiple tools for the analysis of our identified hypothetical proteins as drug targets. DoGSiteScorer (Volkamer et al., 2012) from the Protein Plus web portal (https://proteins.plus/) (F€ ahrrolfes et al., 2017) was used to identify the active site residues (i.e. the potential binding pocket) (Supplemental Material, Table S1). DoGSiteScorer uses a grid-based method which uses a difference of Gaussian filter to detect potential binding pockets based on the protein 3D structure (Volkamer et al., 2012). Furthermore, DALI -Protein Structure Comparison server (http://ekhidna2. biocenter.helsinki.fi/dali) was used for the function prediction based on 3D structure comparison (Holm, 2020). MGLTools package version 1.5.7 was used to prepare the protein targets and Autodock Tool (MGLTool package version 1.5.7) was used to add the charges and convert files to .pdbqt format (supported by Autodock Vina) and grid box.

Preparation of ligand library
5008 drug-like molecules (Natural products and their derivatives) were downloaded from the ZINC database (Irwin et al., 2012) and prepared for the docking analysis according to the Vilela Rodrigues et al. (2019).

Molecular docking and virtual screening
AutoDock Vina, the most cited molecular docking software program was used to perform docking analysis (Trott & Olson, 2010). Virtual screening was performed with the help of Python script. Chimera visualizing software (Pettersen et al., 2004) was used to analyze the binding and also for the extraction of 3-dimensional images. Pose View was used for 2-dimensional image extraction (Stierand et al., 2006).

Molecular dynamics method (MD)
Both receptors in their APO form, as well as their correspondent complexes from docking studies were performed molecular dynamics with 100 ns using software GROMACS 2021 (Abraham et al., 2015). First, the complexes were prepared using the CHARMM-GUI (Jo et al., 2008) tool in order to parametrize the receptor amino acids and ligand atoms according the CHARMM36-2021 forcefield (Huang & MacKerell, 2013). After, receptors and ligands were checked for possible clashes using the UCSF-Chimera program (Pettersen et al., 2004).
Receptors and complexes were simulated using the GMX module. The receptor and complex were loaded and the ligand parameters were generated by the accessory programs Avogadro (Hanwell et al., 2012) and CgenFF (Vanommeslaeghe et al., 2010). Then we followed the molecular dynamics protocol according to the steps: (i) energy minimization, (ii) one step of temperature equilibration (NVT) for 100 ps, (iii) one step of pressure equilibration (NPT) for 100 ps, (iv) a production molecular dynamic (MD) for 100 ns, followed by the analysis step. For the MD analysis we used the GMX module for generating RMSD, RMSF, HBOND and Energy files which were load in the GRACE program for generating their respective graphs.

Results
The workflow showed in Figure 1 summarizes the sequence of steps for drug and vaccine target selection with the methodologies used and the total number of proteins used in this work.

Identification of intra-species conserved non-host homologous proteins
We compared 59 genomes of S. marcescens strains (Table 1) to perform the OrthoFinder analysis. 2574 CDSs are shared by all strains and corresponds to the core genome. Among these CDSs, considering the human genome as the host, we found that 759 are non-host homologous proteins.

Prediction of vaccine candidates for Serratia marcescens
To prospect the possible vaccine, firstly, we predicted the protein subcellular localization using SurfG þ software (Barinov et al., 2009). From 759 non-host homologous proteins, 87 are putative surface-exposed proteins, 183 secreted proteins, and 80 membrane proteins (Figure 1). These proteins were submitted for analysis in Vaxign (He et al., 2010) to effectively predict the vaccine candidates for S. marcescens. We identified 28 proteins as possible vaccine candidates, which were analyzed using DEG (Zhang et al., 2004) to identify those that are essential to S. marcescens survival. We found 7 essential proteins that are described in Table 2 with their respective antigenic features. Two of them are transglycosylases (BAO35374.1 and BAO33631.1), an outer membrane lipoprotein (BAO35734.1), three transporters (BAO36572.1, BAO35526.1 and BAO36204.1), and a hypothetical protein containing DUF481 domain (BAO32244.1). Four candidates are secreted proteins while three are putative surfaceexposed. All they have signal peptides according to SignalP and one or none transmembrane helices according to TMHMM (He et al., 2010;Ross et al., 2018). The prediction of some functional domains was performed using InterProScan (Mitchell et al., 2019) and the protein name and molecular weight using UniProt (Consortium, 2018). Non-host homologous proteins are described in Table 2. All vaccine candidates have a molecular weight below 100 kDa. These characteristics highlight their high potential to be good vaccine candidates.

Drug target prediction and gut flora non-homology analysis
Between the 759 non-host homologous proteins, 409 are predicted as cytoplasmatic by SurfG þ software and these protein sequences were submitted to MHOLline platform. Only those proteins from the G2 group were selected. Then, 48 proteins with 'Very High' and 53 with 'High' classification were selected by the essentiality analysis with DEG. A total of 67 proteins were predicted as essential for S. marcescens, according to the parameters described in the methodology.
Next, we submitted these proteins to PIBIT to select those that would not have homology with the proteomes of human gut microbiota. Only two proteins (BAO33969.1 and BAO34294.1) were considered non-homologous (Table 3) and were used for druggability and docking analysis. Once they are hypothetical proteins and aiming to infer more information about them we identified structurally similar protein to our hypothetical proteins (Table 3) from DALI Server (Holm, 2020). The result of the DALI server designated a notable similarity for our targets BAO33969.1 and BAO34294.1 with the Putative Antibiotic Biosynthesis Monooxygenase (Z score ¼ 9.5) and Activating Signal Cointegrator (Z score ¼ 7.1) respectively (Supplemental Material, Table S2).

Molecular docking and virtual screening
Molecular docking analysis is a widely used approach for structure-based drug designing. Structure-based drug designing is applied to identify the most favorable 3D conformation of the ligand inside the binding/active sites of targets and to give quantitative projections of energy differentiation in the intermolecular identification process (dos Santos et al., 2018).
In this work, we used Autodock Vina (Trott & Olson, 2010) for molecular docking analysis. Nowadays in drug development, natural products are playing an important role and 5008 drug-like molecules (natural products and its derivatives) were used for the docking analysis for our identified protein targets and virtual screening was performed. Furthermore, top 10 molecules based on the binding energy were selected for detailed analysis (Supplemental Material, Table S3). After a detailed analysis of binding energy for the identified top molecules and targets (protein-ligand interaction), the best ligand with the targets was displayed with their ZINC/PubChem compound ID, molecular weight, AutoDock Vina binding affinity, number of hydrogen bonds, and interacted residues (Supplemental Material, Table S4). Chimera was used to identify the 3D and Pose View was used for the 2D image extraction (Figures 2 and 3). Based on our structure comparison of targets BAO33969.1 with its template structure (PDBID -2OKQ, Crystal structure  Table 2. Predicted vaccine targets for S. marcescens. From Vaxign were predicted the number of transmembrane helix (TMHMM) and MHC-I and MHC-II adhesion probability. The subcellular location was obtained from SurfG þ software (PSE -putative surface exposed; SEC -secreted; MEM -membrane). Signal-P indicates the presence and localization of signal peptides in amino acids sequence. Protein ID, Locus tag and Gene are obtained from NCBI Genbank database. The protein name, molecular weight (MW) and amino acids length (AA) are obtained from Uniprot database.  Figures S5 and S6), we identified the active site residues from the online server DoGSiteScorer (Volkamer et al., 2012). The druggable pocket predicted through DoGSiteScorer and comprising active site residues identified through of comparative analysis of template and model structure were used for docking. By performing molecular docking and virtual screening of target BAO33969.1, we identified drug like molecule ZINC04235390/CID11883985 interacted with the amino acid residues GLU40 and TRP42. Interestingly, the interacting amino acid residue GLU40 was belonged to active site residue identified from DogSiteScorer Potential Binding Pocket. Figure 2 shows the binding interaction 3D and 2D representation of target BAO33969.1 with compound ZINC04235390/CID11883985. Drug target BAO34294.1 showed interaction with drug like molecule ZINC04259491/ CID11886074 with amino acid residues ARG3, GLU4, THR6 identified as active site residue in our active site residues (Potential Binding Pocket) analysis. Figure 3 shows the binding interaction 3D and 2D representation of target BAO34294.1 with compound ZINC04259491/CID11886074.

Molecular dynamics analysis
The 100 ns molecular dynamics simulations for both complexes described the protein-ligand behavior along the time, including their interaction energy values as well as the ligand Table 3. Predicted drug targets for S. marcescens. Protein ID and Locus tag was obtained from NCBI Genbank database. The structure quality was obtained from MHOLline is showed and the name, gene, molecular weight (MW) and amino acids length (AA) was obtained from UniProt database. The structure similarity prediction for functions was obtained from DALI server. The major protein domains were obtained from InterPro.  stability inside the binding pocket described in docking studies. For the complex BAO33969.1 and ZINC04235390, the ligand ZINC04235390 ranged its RMSD between 0.2 and 2.7 nm during the 100 ns simulation, inside the BAO33969.1 receptor active site (Figure 4(A)). As can be seen, this ligand present RMSD stability around 0.5 nm up to 40 ns of simulation, and then it changes its conformation abruptly after 50 ns, which could indicate that this molecule moved out of the active site after this simulation time.
On the other hand, the RMS fluctuations of the BAO33969.1 receptor residues simulated alone (APO) and complexed with the ligand, did not presented any reduction in the active site amino acid RMS fluctuation values ( Figure  4(B)). Interestingly, the complex BAO33969.1-ZINC04235390 presented higher fluctuation in comparison to the APO form. This fact could be explained by the nature of this ligand interaction with the receptor active site, with two possible transitory hydrogen bonds with the amino acids Glu40 and Trp42 and two aromatic interactions with Ala23 and Val39. Furthermore, the number of hydrogen bonds varied from zero to three. In the Figure 4(C), the graph shows a most frequent number of hydrogen bonds formed until 40 ns of simulation, which could be explained by the ligand RMSD variation at the same time. In this case is possible assume that this ligand stopped its interactions with the active site amino acids Glu40 and Trp42 and moved out to interact with other surrounding amino acids.
The coulombic short-range interaction energy between the ligand ZINC04235390 and the receptor BAO33969.1 followed the same behavior as the other analyses of RMSD, RMSF and HBOND interactions, with its lowest energy values up to 40 ns of simulation time (Figure 4(D)). From 0 to 40 ns of simulation the energy had an average of À25 KJ/mol, and after this time we verified an abrupt energy decrease around À125 KJ/mol. This energy decreasing could be explained by the ligand change conformation which possibly moved out from the active site.
For the complex BAO34294.1 and ZINC04259491, the ligand ZINC04259491 RMSD ranged from 0.25 to 2.0 nm, and its average value persisted below 1.0 nm up to 85 ns of simulation time. In the Figure 5(A) we can see that this presented its lowest RMSD values between 50 and 55 ns, as well as between 80 and 85 ns. Furthermore, this molecule remained with its average RMSD below 0.7 nm up to 50 ns. After 85 ns, the ligand changed its RMSD abruptly, which could indicate that this molecule mover out from the active pocket of the receptor BAO34294.1 or reduced its interaction.
The RMS fluctuation of the receptor BAO34294.1 complexed with the ligand ZINC04259491 ( Figure 5(B)) shows that the overall structure had a significant reduction in the RMSF values in comparison with the APO form of the BAO34294.1 receptor. Additionally, in the active site region between the residues one to six, including the previous described Arg3, Glu4 and Thr6, we found a reduction of 0.5 nm in the RMSF of these residues, as well as a reduction of 0.6 nm in the region of the amino acid Try88. This fact could indicate that this ligand performed a strong interaction inside the active pocket which was capable to block the amino acid movements and suggesting an inhibition mechanism. Additionally, it can be seen in the hydrogen bond graph (Figure 5(C)) that this ligand performed a great number of hydrogen bonds during all the simulation time, ranging from one to six. Thus, in this aspect we found an increased number of hydrogen bonds between 25 and 80 ns. This fact could be related to a great number of polar groups with hydrogen donors and acceptors from the ligand ZINC04259491.
The interaction energy analysis of the complex BAO34294.1-ZINC04259491 showed stability after 25 ns until de end of the simulation time of 100 ns and presented an average coulombic short-range energy around À50 KJ/mol ( Figure 5(D)). In fact, these values are related to the nature of the ligand chemical characteristics which was capable to as strong interaction inside the active site, mainly by the number of hydrogen bonds and polar interactions.

Discussion
Infections caused by drug-resistant bacteria are one of the biggest public health problems nowadays. These pathogens have the ability to spread very quickly, especially in Intensive Care hospital environments and affect mainly immunocompromised patients, neonatal and others (Resistance, 2019). S. marcescens is an opportunistic nosocomial Enterobacteriaceae that has a high potential for dissemination and is intrinsically resistant to several antibiotics (Mahlen, 2011;Wilkowske et al., 1970). Recently, the options of treatment for this bacteria have become fewer but, despite this, only now it is being studied more deeply. Due to the difficulty in controlling its spread, it was included in the WHO critical list of research priorities for the development of new antibiotics (WHO, 2017). Several vaccine candidates are being studied for organisms with antimicrobial resistance including those that demand extensive antibiotic use or nosocomial infections (Buchy et al., 2020). So, in this work, we seek to predict new targets for vaccines and also for drugs against S. marcescens. For that purpose, from 59 S. marcescens complete genomes, we applied the comparative genomics and reverse vaccinology (Figure 1). Considering that the core genome is composed of CDS shared by all 59 strains, we found that 2574 proteins belong to the core genome. This means that any protein that will be targeted for vaccine or drug is likely to be effective in the various strains studied. This is especially important when it comes to resistant multidrug strains. To eliminate those proteins that could be homologous with the human proteins and that could possibly cause some adverse effect on the host (Duffield et al., 2010), we used subtractive genomics. We selected 759 proteins non-homologous to host which were identified as to their subcellular location, other important selection criteria to predict vaccines and drug targets. Cytoplasmic proteins are more favorable as drug targets because they are frequently involved with the essential metabolic processes while proteins more accessible to the pathogen, such as secretory, membrane, or PSE proteins are better targeted for vaccines (Duffield et al., 2010;Vilela Rodrigues et al., 2019), thus, the proteins were directed to next analysis using these criteria.
The adaptive antibody-and cell-mediated immune response are carried out by B and T cells, respectively (Singh & Mishra, 2016). In S. marcescens, it was described that its infection induces immune T-cells to activate the phagocytic and microbicidal functions of macrophages in the host (Kumagai et al., 1992). T cells are phenotypically classified into CD8 þ (cytotoxic T cells) and CD4 þ (T helper) lymphocytes. T-cell epitopes are presented on the surface of an antigen-presenting cell (APC), where they are bound to MHC molecules in order to induce an immune response. So, proteins that bind MHC are a prerequisite for T cell recognition and is a preferred vaccine target (Patronov & Doytchinova, 2013). The adherence is also crucial to bacterial pathogenicity and several adhesins have been shown to give protection in animal models, suggesting them to be potential candidates in protein based vaccines (Kline et al., 2009). Besides that, the presence of more than one transmembrane helix in a protein increases the chances of failure in the isolation and purification of the recombinant protein, making it difficult to produce vaccines on a large scale (He et al., 2010). For the reverse vaccinology using Vaxign, 28 proteins were selected and 7 were considered as essential to the survival of bacteria (Table 2) by DEG analysis (Xiang & He, 2009). All these proteins attend the criteria for vaccine candidates: sequence conservation among different genomes, sequence exclusion from the human genome, high adhesin probability (>0.7), low number of transmembrane helices (<1), presence of signal peptides and essentiality (Table 3). In addition, all our vaccine candidates have a molecular weight between 19-80 kDa, which are important because these proteins are easier to be purified contributing to the production process too (Naz et al., 2015).
Among our vaccine candidates there are some enzymes. Peptidoglycan (or murein) is a bacterial polysaccharide polymer chains of N-acetylglucosamine (GlcNAc) and N-acetylmuramic acid (MurNAc) composing the Gram-negative cell wall and forms the murein sacculus, which completely encloses the bacterial cell, maintaining the shape and size of the cell, providing mechanical support and protecting the bacterium from osmotic lysis. Since peptidoglycan is essential and unique to bacteria, it is target to many antibiotics, including b-lactams (Dom ınguez- Gil et al., 2016;Koraimann, 2003). Two proteins identified in this work as vaccine targets are transglycosylases related to murein cleavage: a membrane-bound lytic murein transglycosylase A (BAO35374.1) and a peptidoglycan lytic exo-transglycosylase (BAO33631.1). Both enzymes cleave the b-(1, 4)-glycosidic bond between the MurNAc and GlcNAc residues in the peptidoglycan, forming non-reducing 1,6-anhydrous-muropeptides. Through a genomics-based approach to vaccine discovery against meningococcal infection a GNA33 lipoprotein showed good results to vaccine target (Pizza et al., 2000) and was recognized that it encodes a lipoprotein homologous to the membrane-bound lytic transglycosylase A (MltA) from Escherichia coli (Jennings et al., 2002), similar to BAO35374.1found here.
During host-pathogen interaction, outer membrane (OM) proteins play a major role in adhesion to and invasion of host cells. Nowadays, the immunization strategies use OM proteins as antigens due to the low risks of the pathogen to recover its virulence and for showing few adverse effects as compared to live attenuated or killed whole-cell vaccination (Solanki et al., 2018). A lipoprotein screened as a vaccine target is divisome-associated lipoprotein YraP (BAO35734.1). YraP is predicted to be an OM lipoprotein consisting of OsmY and nodulation (BON) domains. Recently, it was characterized experimentally in E. coli as an OM-localized component of the cytokinetic ring, which is required for proper NlpD activity, which directly participates in the division process (Tsang et al., 2017). An important fact about this YraP lipoprotein is its homology with GNA2091 in Neisseria meningitidis, which forms part of the licensed N. meningitidis 4CMenB vaccine, prospected with reverse vaccinology (Serruto et al., 2012). GNA2091 was included in the 4CMenB vaccine for its ability to induce protection in a mouse intraperitoneal infection model and possibly has an immune-stimulating activity that could function as an adjuvant in the 4CMenB vaccine (Bos et al., 2014).
It is known that free iron concentration in the human host is very low to support bacterial growth, so, efficient iron-acquisition systems constitute important virulence factors of pathogenic bacteria (Stork et al., 2010). TonB-dependent receptor proteins binds specific substrates, which includes iron siderophores and vitamin B12, and allow its active transport into periplasm proteins (Noinaj et al., 2010). TonB family is involved with virulence of many pathogens and consequently represents a potential vaccine target (Locher et al., 1998;Reeves et al., 2000;Torres et al., 2001). In this work, we found vitamin B12 TonB-dependent receptor (BAO36572.1) and ferrichrome porin FhuA (BAO35526.1) as vaccine candidates. In Brucella melitensis, the BtuB protein, a BAO36572.1ortolog was already found as a possible vaccine candidate. A protein cocktail including the BtuB protein stimulated a mixed Th1-Th2 response in splenocytes from immunized mice (Gomez et al., 2013). The ferrichrome porin FhuA is an outer membrane protein involved in the uptake of ferric iron in complex with ferrichrome, a hydroxamatetype siderophore and are structurally very similar among Enterobacteriaceae. FhuA proteins are targets of several antibiotics (i. e. rifamycin, albomycin) and were used for the design of specific antigens that can be used as vaccines against E. coli and Acinetobacter baumannii since FhuA showed to have conserved epitopes, a property desired for putative vaccine candidates (Locher et al., 1998;Fereshteh et al., 2020;Farahmandian et al., 2016). Another transporter that we found as vaccine candidate was sn-glycerol-3-phosphate-binding periplasmic protein UgpB (BAO36204.1). It is a solute binding protein of UgpABCDE system that belongs to ABC transporter family which mediates the uptake of sn-glycerol-3-phosphate (G3P) during the de novo biosynthesis of the plasma membrane. These proteins are exclusively found in prokaryotes and are involved in nutrient transportation as well as bacterial pathogenicity (Adhikari et al., 2017). Because some ABC importers play a role in bacterial virulence and survival, the components of these importers emerge as ideal targets for the development of active attenuated antibacterial vaccines (Soni et al., 2020). For instance, UgpB was reported as a promising vaccine candidate against Mycobacterium tuberculosis (Zvi et al., 2008).
Domains Unknown Function (DUF) are hypothetical proteins whose experimental existence of protein is available but not related to any known functional or structural domain (Varma et al., 2015). In our analyses, the vaccine candidate with the best probability of adhesion to the MHC complex (0.909) was the DUF481 domain-containing protein (BAO32244.1) that has not been characterized yet. According to Transporter Classification Database (TCDB; www.tcdb.org), the DUF481 family consists of large numbers of proteins from a wide range of Gram-negative bacteria and usually have an N-terminal signal sequence as well as 10 À 12 putative TM b-strands. An E. coli K12 member of the family is YdiY, a probable outer membrane (Stancik et al., 2002).
MHOLline generated the best quality 3D structures of 101 cytoplasmic proteins, of which 67 were considered as essential by DEG. From these, only two proteins did not show homology with the proteomes of the human gut microbiota: DUF1428 family protein (BAO33969.1) and N(4)-acetylcytidine amidohydrolase (BAO34294.1) ( Table 3). The N(4)-acetylcytidine amidohydrolase was recently characterized in E.coli (YqfB protein) and inferred by homology in UniProt database to S. marcescens. This enzyme catalyzes the hydrolysis of N4acetylcytidine (ac4C) and contains the human activating signal cointegrator homology (ASCH) domain (Stanislauskien _ e et al., 2020). No function was attributed to ASCH domain but it was already suggested that it could be responsible for RNA binding during transcription activation, RNA processing, and regulation of translation (Iyer et al., 2006). The ac4C participate in priming and activation of the NLRC4 inflammasome, inducing interleukin-1b (IL-1b) production (Furman et al., 2017). According to DALI server that predicts the function based on 3D structure comparison, BAO34294.1 showed structure similarity with an activating signal cointegrator protein, corroborating with the information presented. We found little information about BAO33969.1 since it is uncharacterized yet. DALI server indicated structural similarity with a putative antibiotic biosynthesis monooxygenase (ABM). ABM family members present moderate sequence homology and catalyze many different chemical reactions in biological processes, including metabolism, transcription, translation, and biosynthesis of secondary metabolites, but share a high degree of structural similarity and probably the mechanism of action too, as demonstrated by the exclusive use of molecular oxygen in the oxidation reactions they catalyze (Bab-Dinitz et al., 2006;Henrick & Hirshberg, 2012).
Molecular docking is an established in silico structurebased method widely used in drug discovery because it identifies novel compounds of therapeutic interest, predicting ligand-target interactions at a molecular level without necessarily knowing the chemical structure of targets (Pinzi & Rastelli, 2019). From ZINC database, a selection of natural compounds are tested for their efficacy in binding to drug targets based mainly in its energy levels, where lower energy levels and other parameters means greater interaction capacities (Thomsen & Christensen, 2006). Here, using molecular docking and virtual screening we identified drug like molecule ZINC04235390/CID11883985 as better interactor BAO33969.1 target. The drug target BAO34294.1 showed the interaction with drug like molecule ZINC04259491/ CID11886074. Moreover, during MD simulation, the complex BAO33969.1 and ZINC04235390 was seen to be stable until 40 ns with up to three hydrogen bonds and energy helping this stability. Also, the complex BAO34294.1 and ZINC04259491 persisted until 85 ns and obtained a strong interaction with the active site, mainly between amino acids one to six, reinforced by several hydrogen bonds throughout the simulation. Therefore, our results indicate that these complexes can strongly bind and stably, therefore being good candidates for new drug development against S. marcescens. Despite that, the drug targets that we predict in this work are poorly studied and a deeper characterization is necessary to better understand their role in biological processes. However, such proteins have met the requirements for an efficient antimicrobial activity against S. marcescens.

Conclusions
We present seven proteins that are possible vaccine candidates and many of them have already been indicated as good targets for vaccines in other organisms. This gives us credibility in our methodology and indicates that a vaccine could be developed for a multibacterial defense. But more rigorous characterization is necessary for this to be possible. Also addressing the need to develop new drugs that help control the outbreak of resistant multidrug bacteria, we present two possible drug candidates for infections with S.
marcescens. This new information may speed up the knowledge about the treatment of these nosocomial infections.

Acknowledgements
To PhD. Eduardo Costa Almeida for the logistics of submitting the molecular dynamics calculations conducted on the supercomputer of NBCGIB -Center for Computational Biology and Biotechnological Information Management at the State University of Santa Cruz (UESC).

Data accessibility
The genomes are available at GenBank as described in Table 1

Disclosure statement
We declare we have no competing interests.
Authors' contributions L.C.S.P. carried out all the data analysis and participated in conceiving and designing the study and drafted the manuscript; A.K.J. participated in molecular docking analyses and helped draft the manuscript; A.G.F. and T.C.V.R. carried out the designing the study, data analyses and reviewed the manuscript; B.S.A and R.B.K. carried out the molecular dynamic analysis. S.T, D.B., V.A., M.V.S. and C.J.F.O. reviewed the manuscript and the data for confirmation of results and gave insights; S.C.S. conceived, designed and coordinated the study, and helped draft the manuscript. All the authors gave their final approval for publication.