Computational design of candidate multi-epitope vaccine against SARS-CoV-2 targeting structural (S and N) and non-structural (NSP3 and NSP12) proteins

ABSTRACT: The COVID-19 pandemic caused by the SARS-CoV-2 virus has caused global damage and has exposed the vulnerability of scientific research to novel diseases. The scale of the pandemic is immense, with more than 6 million deaths worldwide in a span of 2 years. Considering the gravity of the situation, scientists across the world are continuously attempting to create successful therapeutic solutions to combat the virus. Various vaccination strategies are being devised to ensure effective immunization against SARS-CoV-2 infection. SARS-CoV-2 spreads very rapidly, and its infection rate is remarkably higher than that of other respiratory tract viruses. Viral entry and recognition of the host cell are facilitated by the S protein of the virus. The N protein, along with NSP3, is mainly responsible for viral genome assembly, and NSP12 performs polymerase activity for RNA synthesis. In this study, we have designed a multi-epitope, chimeric vaccine based on two structural proteins (S and N) and two non-structural proteins (NSP3 and NSP12) of the SARS-CoV-2 virus. The aim is to induce an immune response by generating antibodies against these proteins, thereby targeting viral entry and viral replication in the host cell. Computational tools were used throughout, and the reliability of the vaccine was assessed in silico using molecular docking, molecular dynamics simulation and immune simulation studies. These studies indicate that the designed vaccine interacts steadily with Toll-like receptors with good stability and is predicted to induce a strong and specific immune response in the body. Communicated by Ramaswamy H. Sarma


Introduction
The highly contagious COVID-19 pandemic caused by the SARS-CoV-2 virus has had a disastrous impact on the world's population, resulting in the deaths of over 6 million people globally and emerging as the world's most serious health emergency to date (Cascella et al., 2022). The pandemic led to loss of livelihoods as a result of extended lockdowns, which further aggravated the worldwide crisis. COVID-19 has a wide clinical range, spanning from asymptomatic infection to severe respiratory illness that needs hospitalization and critical care. A variety of factors influence the duration and severity of the disease, including age and the prevalence of other underlying comorbidities (Sun et al., 2020). The infectious nature of the virus, a lack of thorough understanding of the disease's course, and the lack of an efficient cure have contributed to this dreadful situation (Park, 2020). Despite significant advances in research leading to a better grasp of SARS-CoV-2 and COVID-19 management, restricting the persistent spread of the virus and its variants has been a growing challenge as SARS-CoV-2 continues to cause havoc throughout the world (Cascella et al., 2022).
SARS-CoV-2 is a spherical, enveloped virus with a diameter of 80-120 nm and several outwardly projecting club-like homo-trimeric S proteins, giving it the appearance of a solar corona. The S protein, or spike protein, shows a varied range of conservancy across the Coronaviridae family and is mainly responsible for the first stage of infection, that is, invasion of the host cell after its recognition (Tortorici et al., 2020; Yadav et al., 2021). The S protein is cleaved into two subunits, S1 and S2, by the furin protease of the host cell, after which the S1 subunit (containing the receptor-binding domain, RBD) attaches to the ACE2 receptor on host cells (Hoffmann et al., 2020). This leads to the fusion of viral and host cell membranes, enabling viral entry and infection (Walls et al., 2020). Findings from several studies show that most antibodies isolated from infected patients were directed against the RBD of the S protein (Chi et al., 2020; Liu et al., 2020; Piccoli et al., 2020). Given its efficiency in eliciting antibodies, the S protein is a major target for vaccine design. Another structural protein, the nucleocapsid or N protein, plays a multi-faceted role in the SARS-CoV-2 infection cycle (McBride et al., 2014). The N protein contains three functional domains (N-terminal domain, C-terminal domain and linker or RNA-binding domain), which together facilitate RNA binding and trigger structural dynamics that enhance the affinity of the viral RNA to non-viral or host RNA (Chang et al., 2006). In addition, the N protein participates in viral genome assembly and viral genome replication along with the NSP3 protein (Cong et al., 2020). Considering its immunogenic nature and pivotal role in viral genome replication and packaging, the N protein is an important target for vaccine design (Yadav et al., 2021).
Along with the structural proteins of the SARS-CoV-2 virus, the non-structural proteins (NSPs) are essential for the formation of the viral replicase complex, the formation of the double-membrane vesicle, and the regulation of host cellular pathways. Among all the non-structural proteins, NSP3 is the largest protein in the SARS-CoV-2 proteome, with 1945 amino acids comprising 16 domains (Yoshimoto, 2020). In SARS-CoV-2 and similar SARS-CoVs, NSP3 has a papain-like protease domain (PLpro) that cleaves polyprotein 1a(b) to release NSP1 and NSP2 (Freitas et al., 2020). The macro domain of NSP3 is also engaged in disrupting the expression of innate immunity genes (Fehr et al., 2016). Its de-ubiquitylation activity targets proteins involved in type I interferon and NF-κB inflammatory signaling (Matthews et al., 2014). In addition to these properties of NSP3, its participation in viral assembly alongside the N protein is noteworthy. The SARS-CoV-2 RdRp (also known as NSP12) is an important part of the viral replication/transcription machinery and contains 932 amino acids. The viral genome is replicated by two RNA-dependent RNA polymerases (RdRps) (Aftab et al., 2020). NSP12 interacts with other non-structural proteins such as NSP14 to ensure consistency in RNA synthesis (Kirchdoerfer & Ward, 2019). Though it is capable of performing polymerase activity on its own, the NSP7 and NSP8 proteins act as stimulating factors for its polymerase activity in mediating viral RNA synthesis (Peng et al., 2020). Therefore, given their importance in the viral replication/transcription mechanism and in viral assembly, NSP12 (RdRp) and NSP3 are considered major targets for designing therapeutics such as vaccines (Yadav et al., 2021; Littler et al., 2020).
The urgent need for a vaccine against the SARS-CoV-2 virus has been recognized as a significant problem. Effective vaccination might have a crucial role in limiting the virus's transmission and eventually eradicating it from the human population (Ahmed et al., 2020). Studies on various structural and non-structural proteins facilitate the development of novel vaccination techniques that are specific and structure-driven (Mariano et al., 2020). The main aim of prophylactic vaccine development against SARS-CoV-2 is to restrict the virus entry pathway, which is achieved by targeting the S and N structural proteins of the virus (Peng et al., 2021). Studies also show that non-structural proteins (such as NSP3 and NSP12) are mostly responsible for viral assembly and RNA synthesis, which also makes them targets for potential therapeutic strategies (Hardenbrook & Zhang, 2022). Studies have shown that NSP3 and NSP12 as antigens can be responsible for more than 80% of total helper T cell and cytotoxic T cell responses in the body. Also, the highest number of IgM epitopes are encoded by NSP12 and NSP3, and the highest number of IgG epitopes are encoded by NSP3 and the S protein of the SARS-CoV-2 virus (Tarke et al., 2021).
Vaccine strategies that encompass the complete organism or large proteins lead to excessive antigenic load and a high chance of allergic reactions in the body (Cheng et al., 2021). This challenge can be overcome by including only short stretches of peptides that are both antigenic and immunogenic in nature but do not cause any allergic reaction (Basak et al., 2021; Kar et al., 2020). With advancing computational techniques, vaccine design can be made easier and completed in a shorter time. Prediction and screening of immunogenic and antigenic cytotoxic T cell (CTL) and helper T cell (HTL) epitopes for a potential multi-epitope vaccine construct can be carried out readily using in silico tools. Therefore, in this study, we aim to design a multi-epitope, prophylactic vaccine construct targeting structural proteins (S and N) and non-structural proteins (NSP3 and NSP12).
When the vaccine is injected, it is recognized by the body's immune system through pattern recognition receptors (PRRs) of the host cell, which recognize foreign particles through pathogen-associated molecular patterns (PAMPs) (Palm & Medzhitov, 2009). Immune cells circulating in the body, such as dendritic cells, macrophages and neutrophils, are sensitized by these PAMPs of the pathogen (in this case the vaccine) and elicit 'danger signals' through which the host cells are alarmed (Iwasaki & Medzhitov, 2010). Thereafter, pro-inflammatory cytokines and chemokines are produced by these cells, which eventually activate the monocytes, natural killer cells and granulocytes. T cells and B cells are activated in lymph nodes by mature dendritic cells, and interaction with naive T cells causes them to develop into regulatory CD4+ cells (Sakaguchi et al., 2010). At approximately the same time, the antigen is processed into tiny fragments, which interact with the MHC receptor grooves on the cell surfaces. Generally, 9-mer antigenic fragments interact with MHC class I and 15-mer fragments interact with MHC class II receptors (Krogsgaard & Davis, 2005). Also, the secreted cytokines are responsible for the activation of B cells in order to generate neutralizing antibodies (Ahlers & Belyakov, 2010). A detailed representation of the immune action of the vaccine is depicted in Figure 1.
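The 9-mer and 15-mer fragment lengths mentioned above translate directly into how candidate peptides are enumerated before any binding prediction. The following is a minimal sketch, not the procedure of any prediction server: the function name `sliding_peptides` and the toy sequence are hypothetical and serve only to illustrate the sliding-window idea.

```python
def sliding_peptides(sequence: str, k: int) -> list[str]:
    """Return every contiguous k-residue window of a protein sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Hypothetical toy sequence; it is not taken from any SARS-CoV-2 protein.
toy_protein = "MKTAYIAKQRQISFVKSHFSRQ"

mhc1_candidates = sliding_peptides(toy_protein, 9)    # typical MHC class I length
mhc2_candidates = sliding_peptides(toy_protein, 15)   # typical MHC class II length

print(len(mhc1_candidates), len(mhc2_candidates))
```

A protein of length n yields n - k + 1 windows of length k, which is why large proteins such as NSP3 produce a very large pool of candidates that must then be screened.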

Vaccine construction, modelling and validation
The screened epitopes were linked together with GPGPG linkers and the CTB adjuvant to construct a linear vaccine sequence. Different arrangements of the selected epitopes were made to obtain multiple vaccine constructs. All these constructs were subjected to homology modelling by the trRosetta server (https://yanglab.nankai.edu.cn/trRosetta/help/; Yang et al., 2020). All the obtained models were checked for validity using different tools. The Z-score was predicted using the ProSA webserver (https://prosa.services.came.sbg.ac.at/prosa.php). The Ramachandran plot was checked using the Molprobity server (http://molprobity.biochem.duke.edu/) and the ERRAT score was predicted using the SAVES server (https://saves.mbi.ucla.edu/; Messaoudi et al., 2013; Wiederstein & Sippl, 2007; Williams et al., 2018). Based on the scores obtained, the best model was chosen for further studies.

Population coverage of the designed vaccine candidate
The population coverage of the designed vaccine candidate, with respect to the entire world population and the most frequently occurring HLA alleles, was analyzed using the IEDB population coverage analysis tool (https://tools.iedb.org/population/).

Molecular dynamics simulation
Before running the MD simulations, the complexes generated in the docking studies were optimized using the default configuration of the QuickPrep tool of the MOE® package (https://www.chemcomp.com/Products.htm), in order to optimize bond lengths and angles, calculate charges, and properly protonate residues according to the physiological pH.
The MD simulations were performed using the dynamics tool of the MOE® package (https://www.chemcomp.com/Products.htm). Specifically, the Compute/Simulations/Dynamics path was used to prepare each system according to the parameters of the NAMD software (Nelson et al., 1996), using the AMBER10:EH forcefield (Case et al., 2008) with cutoffs of 10 for electrostatic and 8,10 for VdW interactions. The systems were centered in boxes containing around 40,000 and 110,000 water molecules for the TLR2 and TLR4 complexes, respectively, and later neutralized with NaCl ions. Each round of simulation involved 10 ps of energy minimization followed by 100 ps of NPT and 200 ps of NVT equilibration. After the equilibration, 150 ns of free MD simulation were performed.

Immune simulation study
The immunogenic profile of the vaccine candidate was assessed using the C-IMMSIM immune server (https://kraken.iac.rm.cnr.it/C-IMMSIM/). C-IMMSIM is an agent-based model that includes machine learning to predict immunological interactions using position-specific scoring matrices (PSSM) (Rapin et al., 2010). For most vaccinations now in use, the minimum recommended period between the first and second dosage is 4 weeks (Castiglione et al., 2012). The entire simulation was conducted for 180 days. In order to evaluate the vaccination protocol's effectiveness, three vaccine injections were given on days 3, 30 and 60.
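Agent-based immune simulators of this kind are typically driven by discrete time steps rather than calendar days, so a dosing schedule has to be converted before it is entered. The sketch below is illustrative only: it assumes that one simulation time step corresponds to 8 hours of real time (an assumption not stated in this study; check the server documentation), and the names `HOURS_PER_STEP` and `day_to_step` are hypothetical.

```python
# Assumption (not stated in this study): one simulation time step = 8 hours.
HOURS_PER_STEP = 8


def day_to_step(day: float) -> int:
    """Convert a day count into the nearest simulation time step (minimum 1)."""
    steps_per_day = 24 / HOURS_PER_STEP
    return max(1, round(day * steps_per_day))


injection_days = [3, 30, 60]   # injection days reported in the protocol above
total_days = 180               # total simulated period reported above

injection_steps = [day_to_step(d) for d in injection_days]
total_steps = day_to_step(total_days)

print(injection_steps, total_steps)
```

If the server uses a different step length, only `HOURS_PER_STEP` needs to change.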

In silico cloning of the codon optimized vaccine construct
The vaccine construct was codon optimized using the Java Codon Adaptation Tool (JCat) (http://www.jcat.de/). It was also reverse translated into a cDNA sequence, which was inserted into the pET-28a(+) vector using the SnapGene software (https://www.snapgene.com). The JCat output contains the GC content as well as a codon adaptation index (CAI) score, which can be used to estimate protein expression levels in E. coli (Grote et al., 2005).

T cell epitope prediction
The presence of both cytotoxic T cell (CTL) and helper T cell (HTL) epitopes in a vaccine is essential for generating a strong and long-lasting immune response (Testa & Philip, 2012). The primary role of CTL epitopes is to eliminate the antigen as well as the infected cells in the body, whereas HTL epitopes are actively responsible for triggering both humoral and cell-mediated immunity (Doherty et al., 1992; Panina-Bordignon et al., 1989). The CTL and HTL epitopes were predicted from the SARS-CoV-2 structural proteins (S and N) and non-structural proteins (NSP3 and NSP12). The CTL epitopes were predicted by the NetCTL 1.2 server along with the IEDB consensus method (Supplementary Tables 1-4), and the NetMHCIIpan 3.2 server was used to predict the HTL epitopes (Supplementary Tables 5-8). From the large pool of predicted epitopes, the best epitopes were selected based on high binding affinity with MHC class I and MHC class II alleles, high antigenicity scores and high immunogenicity scores. A total of 8 CTL epitopes (2 from each protein) and 8 HTL epitopes (2 from each protein) were finalized based on the above-mentioned parameters, and are listed in Tables 1 and 2. These finalized epitopes were considered for vaccine construction.
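The screening logic described above (strong MHC binding, high antigenicity, high immunogenicity, two epitopes per protein) can be sketched as a simple filter-and-rank step. This is a hedged illustration, not the authors' pipeline: the `Epitope` record, the function names and the example values are hypothetical, and the cutoffs used (IC50 below 500 nM, antigenicity above 0.4, positive immunogenicity) are applied here at the epitope level as an assumption.

```python
from dataclasses import dataclass


@dataclass
class Epitope:
    protein: str
    peptide: str
    ic50_nm: float         # predicted MHC binding affinity (lower is stronger)
    antigenicity: float    # e.g. a VaxiJen-style score
    immunogenicity: float  # e.g. an IEDB class I immunogenicity score


def passes_screen(e: Epitope, max_ic50_nm: float = 500.0,
                  min_antigenicity: float = 0.4) -> bool:
    """Assumed cutoffs: good binder, antigenic, positive immunogenicity."""
    return (e.ic50_nm < max_ic50_nm
            and e.antigenicity > min_antigenicity
            and e.immunogenicity > 0)


def select_top(epitopes: list[Epitope], per_protein: int = 2) -> list[Epitope]:
    """Keep the strongest binders per protein among epitopes that pass the screen."""
    selected: list[Epitope] = []
    for protein in sorted({e.protein for e in epitopes}):
        passing = [e for e in epitopes if e.protein == protein and passes_screen(e)]
        passing.sort(key=lambda e: e.ic50_nm)
        selected.extend(passing[:per_protein])
    return selected


# Hypothetical placeholder records; peptides and scores are invented for illustration.
pool = [
    Epitope("S", "AAAAAAAAA", 35.0, 0.9, 0.2),
    Epitope("S", "CCCCCCCCC", 120.0, 0.6, 0.1),
    Epitope("S", "DDDDDDDDD", 20.0, 0.3, 0.3),    # fails antigenicity
    Epitope("S", "EEEEEEEEE", 60.0, 0.7, 0.05),
    Epitope("N", "FFFFFFFFF", 800.0, 0.8, 0.4),   # fails binding cutoff
    Epitope("N", "GGGGGGGGG", 450.0, 0.5, 0.1),
]

chosen = select_top(pool)
print([e.peptide for e in chosen])
```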

Vaccine construction, modelling and validation
The vaccine was constructed considering a few parameters: (a) the vaccine must show strong binding affinity with MHC I and MHC II alleles, (b) it must be antigenic but not an allergen, (c) it should exhibit high immunogenicity and (d) it should be promiscuous. Based on these criteria, the linear vaccine construct was made, comprising a total of 16 epitopes (8 CTL and 8 HTL) from the S, N, NSP3 and NSP12 proteins. All the epitopes were linked together with a GPGPG linker, as it prevents the formation of junctional epitopes. Studies show that this linker also facilitates effective immune processing of the vaccine (Saadi et al., 2017).
To assist in protective immunity as well as to prevent autoimmune reactions in the body, the Cholera Toxin B (CTB) adjuvant was included in the vaccine construct (Stratmann, 2015). The adjuvant was linked to the vaccine using an EAAAK linker at the N-terminal end of the linear construct (Figure 2A). The linear sequence was then subjected to homology modelling using the trRosetta server (Figure 2B), and the quality of the model was validated by ProSA, ERRAT and Ramachandran plot analysis. The model obtained a Z-score of −4.71 (Figure 2C), which indicates that the model is reliable, as the score falls within the range of scores of comparably sized proteins (Wiederstein & Sippl, 2007). The ERRAT score was predicted to be 78.3051 (Figure 2E). An ideal model should possess an ERRAT score of more than 50; therefore, the score obtained for our designed vaccine is suggestive of a good quality model (Messaoudi et al., 2013). The Ramachandran plot analysis shows that 98.13% of residues fell in the favored region (preferred is more than 98%) and 0% of residues were outliers (preferred is less than 0.05%). This also strongly supports the overall quality of the model (Figure 2D; Williams et al., 2018).
Table note: Epitopes are considered good binders based on IC50 scores < 500 nM. To predict the antigenicity of the epitopes, the VaxiJen v2.0 server was used. The immunogenicity was predicted using the IEDB class I immunogenicity tool.
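The linear assembly described in this section (adjuvant at the N-terminus, an EAAAK linker, then epitopes joined by GPGPG linkers) amounts to simple string concatenation. The sketch below is a minimal illustration under that reading; the function name `assemble_construct` and all sequences shown are hypothetical placeholders, not the actual CTB sequence or the epitopes of Tables 1 and 2.

```python
def assemble_construct(adjuvant: str, epitopes: list[str],
                       adjuvant_linker: str = "EAAAK",
                       epitope_linker: str = "GPGPG") -> str:
    """Join adjuvant + adjuvant linker + epitopes separated by the epitope linker."""
    if not epitopes:
        raise ValueError("at least one epitope is required")
    return adjuvant + adjuvant_linker + epitope_linker.join(epitopes)


# Placeholder sequences for illustration only (not real CTB or real epitopes).
adjuvant_placeholder = "MTTT"
epitope_placeholders = ["AAAAAAAAA", "CCCCCCCCC", "DDDDDDDDDDDDDDD"]

construct = assemble_construct(adjuvant_placeholder, epitope_placeholders)
print(construct, len(construct))
```

Alternative arrangements of the same epitopes, as explored when building multiple candidate constructs, could be generated by passing reordered lists to the same function.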

Prediction of antigenicity, immunogenicity and physicochemical properties of the vaccine
A vaccine must be antigenic as well as immunogenic. The immunogenic nature helps induce both cell-mediated and humoral immune responses in the body, while the antigenic nature helps the body recognize the vaccine as an antigen and trigger an immune reaction against it (Ilinskaya & Dobrovolskaia, 2016). The antigenicity of the vaccine was predicted to be 0.4848 using the VaxiJen v2.0 server. Proteins with scores above 0.4 are ideally recognized as antigens (Doytchinova & Flower, 2007). The immunogenicity predicted by the IEDB class I immunogenicity server was 3.36686, where positive scores indicate that the vaccine is immunogenic (Calis et al., 2013). The vaccine must not trigger any allergic or toxic responses in the body (Basak et al., 2021). Therefore, the allergenicity and toxicity of the vaccine were predicted by the AllerTOP and ToxinPred servers, respectively. The results show that the vaccine is both non-allergic and non-toxic. Physicochemical properties of the vaccine were also checked using the ExPASy server to ensure safety and efficacy. As predicted by the server, the molecular weight of the vaccine is 39.62 kDa. The instability index of the vaccine was predicted to be 39.53, which classifies the protein as stable (scores < 40 are considered stable; Walker, 2005). Usually, a higher aliphatic index indicates a thermostable protein, and the predicted aliphatic index of 67.34 for the vaccine construct supports its thermostable nature (Walker, 2005). The vaccine also exhibits a GRAVY score of −0.186, which indicates the hydrophilic nature of the vaccine (the lower the GRAVY score, the better the solubility of the protein). This suggests that the vaccine will interact effectively in an aqueous environment (Walker, 2005). The hydrophilic nature of the vaccine was further verified by the SoluProt server, which predicted a score of 0.857; any score above 0.5 is indicative of soluble expression in a bacterial host (Hon et al., 2021). The half-life of the designed vaccine construct was 30 h in mammalian reticulocytes, > 20 h in yeast and > 10 h in E. coli (Walker, 2005). All of these predicted properties were consistent with and comparable to the previous body of work in the literature (Supplementary Table 10) (Safavi et al., 2019; Mahdevar et al., 2021a, b; Safavi et al., 2019). The absence of any transmembrane helices in the vaccine suggests that no difficulties would be anticipated in expressing the vaccine during production (Supplementary Figure 1). Also, the lack of signal peptides in the vaccine construct indicates that protein localization is prevented (Supplementary Figure 2).
Table note: Epitopes which have scores ≤ 2.0 are strong binders to their respective alleles. To predict the antigenicity of the epitopes, the VaxiJen v2.0 server was used. The immunogenicity was predicted using the IEDB class I immunogenicity tool.
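Two of the physicochemical descriptors discussed above, the GRAVY score and the aliphatic index, have simple closed-form definitions that can be computed directly from a sequence. The sketch below uses the standard Kyte-Doolittle hydropathy scale and the usual aliphatic index formula (mole percent of Ala, plus 2.9 times Val, plus 3.9 times the sum of Ile and Leu); the function names and test sequences are hypothetical, and this is not a re-implementation of the server used in the study.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    n = len(sequence)

    def mole_percent(aa: str) -> float:
        return 100.0 * sequence.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Hypothetical toy sequence, used only to exercise the functions.
toy = "AVIL"
print(round(gravy(toy), 3), round(aliphatic_index(toy), 1))
```

A negative GRAVY value, as reported for the construct, corresponds to a net hydrophilic composition on this scale.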

Population coverage analysis
Due to geographical and ethnic diversity, the prevalence and expression of HLA alleles may vary throughout the world. This analysis was therefore done to check whether the designed vaccine would be effective for the entire world's population. The selected epitopes had a world population coverage of 100%. Among all the regions, the highest population coverage of 100% was observed in China, East Asia, Europe, India, Northeast Asia, Oceania, South America and the United States, whereas the lowest population coverage of 99.98% was observed for Southeast Asia (Table 3). In conclusion, the data show that the new vaccine might be able to combat SARS-CoV-2 around the world (Supplementary Figure 3).

B cell, interleukin-2 epitope prediction
B cells control the humoral immune response in the body by differentiating into antibody-secreting plasma cells (Jabbar et al., 2018). Though B cell activation and differentiation are aided by T cells, the presence of B cell epitopes in the vaccine construct would ensure better efficacy in generating strong immune responses. Hence, continuous and discontinuous B cell epitopes were predicted from the vaccine construct using the ElliPro server (Tables 4 and 5, respectively). In addition, the HTL and CTL epitopes in the vaccine construct were subjected to IL-2 epitope prediction to assess their efficacy in inducing interleukin-2. The results show that 6 out of 8 HTL epitopes and all 8 CTL epitopes are IL-2 inducers, thereby strengthening the effectiveness of the vaccine (Supplementary Table 9).

Docking with MHC class I and MHC class II receptors
The interaction of the designed vaccine with MHC class I and MHC class II receptors is essential for the activation of CTL and HTL epitopes. Hence, all the selected CTL and HTL epitopes were subjected to docking analysis with MHC class I and MHC class II receptors, respectively, using the Z-DOCK server. The binding pattern of each epitope with its respective allele was analyzed, as depicted in Figure 3.

Docking of vaccine with TLR4 and TLR2
To ensure a stable immune response, the interaction of the vaccine construct with Toll-like receptors (TLRs) is mandatory. The TLRs form the first line of defence against infections and hence are responsible for generating adaptive immunity (Carty & Bowie, 2010). TLR4 and TLR2 have been related to the sensing of viral structural proteins, which triggers the production of inflammatory cytokines (Lester & Li, 2014; Khan et al., 2021). Therefore, the vaccine was subjected to docking analysis with TLR4 and TLR2 using the HADDOCK 2.2 server. Among the many predicted clusters obtained from docking the vaccine with TLR4 and TLR2, the cluster with the lowest HADDOCK score was selected in each case. These clusters were then subjected to refinement, after which the TLR4-vaccine complex showed a HADDOCK score of −157.9 ± 1.6 a.u. and the TLR2-vaccine complex showed a HADDOCK score of −249.5 ± 5.4 a.u. As more negative scores suggest better binding affinity, both docked complexes indicate that the vaccine exhibits good binding affinity with TLR4 and TLR2. A low RMSD of 0.6 ± 0.3 Å for both docked structures implies that they are of good quality with minimal deviations (Supplementary Figures 4 and 6). The statistics obtained after refinement of the docked complexes are given in Tables 6 and 7. The 3D representations of the docked molecules along with a few interactions are shown in Figures 4 and 5. The detailed interactions between the …
Table note: The abbreviation 'pc' stands for population coverage.
Docking statistics (partial table; the label of the first row is missing):
(label missing): −266.0 ± 15.7
Desolvation energy (kcal mol⁻¹): −24.5 ± 1.0
Restraints violation energy (kcal mol⁻¹): 1.2 ± 0.5
Buried surface area (Å²): 2606.5 ± 47.1
Table note: The HADDOCK score, expressed in arbitrary units (a.u.), represents the strength of protein interaction during docking.
Table 7. Statistics of refined docked vaccine and TLR2 complex.

Binding affinity studies and prediction of disulphide bonds
The binding energy, or Gibbs free energy (ΔG), is an essential factor in determining whether an interaction will occur in the cellular environment (Vangone et al., 2019). Therefore, the binding energy of the docked complexes was calculated using the PRODIGY server. The predicted binding energies of the TLR4 and TLR2 docked complexes with the vaccine were −11.4 and −21.8 kcal/mol, respectively. The negative values of the Gibbs free energy suggest that these interactions are feasible in the cellular environment (Table 8). Disulphide bonds are crucial in maintaining and stabilizing a protein molecule (Karimi et al., 2016). Hence, the docked structures were also checked for the presence of any disulphide bonds in their interaction using the Disulfide by Design 2.0 webserver.
Results show that the TLR2-vaccine complex contains 1 disulphide bond between the two chains, whereas the TLR4-vaccine complex contains 4 disulphide bonds (Table 9). The χ3 value denotes the torsional angle between the bonds and the kcal/mol value gives the bond energy. Σ B-factor denotes the summation of temperature factors between the docked complexes (Karimi et al., 2016). All the obtained results suggest that both structures will be experimentally stable proteins.
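The binding free energies reported above can be related to an equilibrium dissociation constant through the standard thermodynamic relation ΔG = RT ln(Kd). The sketch below applies that relation; the temperature of 25 °C (298.15 K) is an assumption made here for illustration and is not stated in the study, and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872  # gas constant in kcal mol^-1 K^-1


def dissociation_constant(delta_g_kcal_per_mol: float,
                          temperature_k: float = 298.15) -> float:
    """Kd in mol/L from the binding free energy, using dG = RT ln(Kd)."""
    return math.exp(delta_g_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


# Binding energies reported above for the docked complexes (kcal/mol).
kd_tlr4 = dissociation_constant(-11.4)
kd_tlr2 = dissociation_constant(-21.8)

print(f"TLR4-vaccine Kd ~ {kd_tlr4:.2e} M")
print(f"TLR2-vaccine Kd ~ {kd_tlr2:.2e} M")
```

Because Kd depends exponentially on ΔG, the more negative value for the TLR2 complex corresponds to a much smaller predicted Kd than for the TLR4 complex.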

Molecular dynamics simulation
The molecular dynamics simulation was conducted for a time period of 150 ns and the total energy of the system was determined. Both the TLR4-vaccine complex and the TLR2-vaccine complex achieved consistently low energy with minor variations during the whole simulated time period (Supplementary Figures 8 and 9). This indicates that both complexes showed very good stabilization over the 150 ns time period. Using 100 ps of NPT and 200 ps of NVT, the impact of temperature, pressure, and other thermodynamic variables was also investigated. The NVT ensemble was used to see whether the systems were stable at the specified temperature of 310 K. Results indicated that both systems reached 310 K quickly and that equilibration was maintained with negligible fluctuation throughout the process. Similarly, the NPT ensemble was used to evaluate the stability of both systems at the desired pressure. In this case too, the structural integrity of the systems was maintained with minimal fluctuations, denoting the systems' stability. After a run of 150 ns for the TLR4-vaccine and TLR2-vaccine complexes, trajectory analyses of the complexes were done to check their stability and flexibility. The RMSD plots showed little variation, indicating that the complexes remained stable over the nanosecond timescale. The RMSF plots revealed a few high peaks, which correspond to the high-flexibility regions of the TLR-vaccine complexes. The RMSD and RMSF plots of the TLR-vaccine complexes are given in Figure 6. Additional plots of the radius of gyration are given in Supplementary Figures 8 and 9, which indicate the compactness of the complexes over the simulation time period of 150 ns.
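The RMSD traces discussed above summarize, for each trajectory frame, how far the atoms have moved from a reference structure. The following is a minimal pure-Python sketch of that quantity; it assumes the frames have already been superimposed on the reference (no fitting is performed here), and the coordinates are hypothetical toy values rather than data from the simulations.

```python
import math

Coordinate = tuple[float, float, float]


def rmsd(reference: list[Coordinate], frame: list[Coordinate]) -> float:
    """Root-mean-square deviation between two equally sized coordinate sets.

    Assumes both sets list the same atoms in the same order and that the
    frame has already been aligned to the reference.
    """
    if not reference or len(reference) != len(frame):
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xr - xf) ** 2 + (yr - yf) ** 2 + (zr - zf) ** 2
        for (xr, yr, zr), (xf, yf, zf) in zip(reference, frame)
    )
    return math.sqrt(squared / len(reference))


# Hypothetical two-atom toy system, coordinates in angstroms.
reference_frame = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],   # identical to the reference
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],   # rigid shift of 2 along z
]

rmsd_trace = [rmsd(reference_frame, frame) for frame in trajectory]
print(rmsd_trace)
```

A flat RMSD trace over time, as described for both complexes, indicates that the structure is not drifting away from its starting conformation.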

Immune simulation study
To generate and analyze the in silico immunogenic response of our designed vaccine construct, an immune simulation study was conducted using the C-IMMSIM server. The vaccine was administered three times at regular intervals of 30 days. The resulting data showed that the immune response evoked during secondary and tertiary exposure to the vaccine was much stronger than the primary response to the vaccine candidate (Supplementary Figure 10). It can be inferred from Figure 7A (i) that there is a significant decline in the antigen concentration during the second and third exposures, while the immunoglobulin activity is maintained at a high level. The production of stronger immune responses is additionally supported by high levels of cytokines such as IFN-γ and IL-2, as depicted in Figure 7A (ii). Also, Figure 7A (iii) and (vi) show the increased B cell and plasma B cell populations in response to the vaccine regimen. Finally, the activity of cytotoxic and helper T cells in response to the vaccine also supports its efficiency in generating a greater immune response in the body (Figure 7A (iv) and (v)).
The immune response generated was further compared with that of a positive control, C5 (Singh et al., 2020; Wahome et al., 2012), and a similar vaccine candidate designed by Safavi et al. (2020). A significant decrease in antigen concentration was observed for the proposed vaccine candidate (Figure 7A (i)), the positive control C5 (Figure 7B (i)) and the candidate designed by Safavi et al., with an increase in the antibody titers after the second and third injections (Safavi et al., 2020). The plot of cytokine concentrations showed that all the vaccine candidates (Figure 7A (ii), B (ii)) (Safavi et al., 2020) had sufficient IFN-γ and IL-2. In this case, the IFN-γ production was higher for C5 than for our proposed vaccine construct (Figure 7A (ii), B (ii)), which is consistent with the previous study by Safavi et al. (2020). There was a striking increase in the IL-2 production of our vaccine when compared to the positive control C5 (Figure 7A (ii), B (ii)), which is again consistent with the previous work (Safavi et al., 2020). The B cell populations were similar in all three cases (Figure 7A (iii) and B (iii); Safavi et al., 2020). However, when the TC cell populations were compared, the candidate vaccine designed by Safavi et al. performed slightly better than both the vaccine designed in this study and the positive control C5. Nevertheless, our designed vaccine candidate had all the desirable properties needed to induce strong immune responses. Additional immune profiling was also performed, and the graphs for our designed vaccine and the positive control are provided in Supplementary Figures 10 and 11, respectively.

In silico cloning of the codon optimized vaccine construct
The vaccine was codon optimized by the JCat (Java Codon Adaptation) tool for optimum production of the vaccine in E. coli, K-12 strain. The GC content of the adapted sequence should preferably lie between 30% and 70%; for the optimized vaccine sequence the GC content was predicted to be 57%, which indicates that the vaccine will be efficiently expressed in the E. coli host cell (Grote et al., 2005). Finally, a recombinant plasmid was designed by inserting the codon-optimized vaccine fragment into the pET-28a(+) vector using the SnapGene tool (Figure 8). The pET-28a(+) vector is a widely accepted system for large-scale production of a desired recombinant peptide. This vector also contains poly-His tags and thrombin protease recognition sites (TPS), which ease the purification of the recombinant protein using standardized protocols (Shilling et al., 2020). Though it contains some design flaws that occasionally affect controlled initiation of transcription and translation, the presence of multiple restriction sites makes it ideal for fragment insertion and expression (Safavi et al., 2019). Many research groups have used the pET-28a(+) vector for the above-mentioned reasons, and hence it was used for this study (Mahdevar et al., 2021a; Safavi et al., 2019; Safavi et al., 2021).
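The GC-content criterion mentioned above is easy to check on any candidate nucleotide sequence. The sketch below computes the GC percentage and tests it against the 30-70% window; the function names and the toy DNA strings are hypothetical, and this does not reproduce the codon adaptation index calculation performed by the optimization tool.

```python
def gc_content(dna: str) -> float:
    """Percentage of G and C bases in a DNA sequence (case-insensitive)."""
    dna = dna.upper()
    if not dna:
        raise ValueError("sequence must not be empty")
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def gc_in_preferred_range(dna: str, low: float = 30.0, high: float = 70.0) -> bool:
    """True if the GC percentage lies within the preferred window."""
    return low <= gc_content(dna) <= high


# Hypothetical toy sequences, not the optimized vaccine sequence.
balanced = "ATGCATGCGC"
at_rich = "ATATATATAT"

print(gc_content(balanced), gc_in_preferred_range(balanced))
print(gc_content(at_rich), gc_in_preferred_range(at_rich))
```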

Discussion
Immunoinformatics approaches have already been used by scientific groups to predict potential antigenic epitopes for the development of multi-epitope vaccine candidates (Tosta et al., 2021). In comparison to traditional vaccines, multi-epitope vaccines prevent unnecessary antigenic overload in the body and promote elicitation of a highly specific immune response (Zhang, 2018). Other advantages of multi-epitope vaccines are the presence of both CTL and HTL epitopes for the generation of cell-mediated and humoral immunity, and the incorporation of a known adjuvant into the vaccine to ensure a strong immunogenic response (Lennerz et al., 2014; Zhu et al., 2014). In the present study, we have used in silico tools to construct a chimeric, multi-epitope vaccine candidate against the SARS-CoV-2 virus, based on 2 structural proteins (S or spike protein and N or nucleocapsid protein) and 2 non-structural proteins (NSP3 and NSP12). The construct shows antigenic, immunogenic, non-allergenic and non-toxic properties, indicating that it is likely to induce an immune response efficiently without allergic side effects. Similar approaches have previously been used to design vaccines against Ebola virus (Shankar et al., 2021), Nipah virus (Ojha et al., 2019), Clostridium difficile (Basak et al., 2021), Chandipura Vesiculovirus (Deb et al., 2022), Klebsiella pneumoniae (Dar et al., 2019) and more. In addition, vaccines developed using this approach have shown successful immune response generation in vivo and have entered phase I clinical trials (Guo et al., 2014; Slingluff et al., 2013).
The S protein is one of the most fundamental determinants of viral antigenicity and aids viral entry into the host cell (Yadav et al., 2021). The N protein, along with NSP3, is chiefly responsible for viral genome replication and genome packaging (Cong et al., 2020), and the prime role of NSP12 is RNA synthesis (Peng et al., 2020). Therefore, targeting four different SARS-CoV-2 proteins in the design of a vaccine candidate is intended to generate an effective immune response against the virus, as it aims not only to restrict viral entry but also to inhibit RNA synthesis and viral genome assembly. The designed vaccine contains cytotoxic T lymphocyte (CTL) and helper T lymphocyte (HTL) epitopes derived from all four proteins. The epitopes were screened from the large pool of predicted epitopes based on a few parameters: they should be both immunogenic and antigenic, must not be allergenic or toxic, and should bind MHC alleles with high affinity. The selected epitopes are joined with GPGPG linkers, and a CTB adjuvant is added to ensure immunogenicity of the vaccine candidate. Studies show that using non-toxic CTB as a potent bacterial adjuvant in vaccines has successfully activated CD4+ T cell responses, resulting in an elevated immune response (Antonio-Herrera et al., 2018). The linear vaccine construct was scanned for B cell epitopes and interleukin-2 (IL-2) inducing epitopes. The results show that the vaccine contains a sufficient number of both, which further strengthens its capability to generate an immune response. The linear vaccine construct was also checked for antigenicity, immunogenicity, allergenicity and toxicity using the VaxiJen v2.0, IEDB class I immunogenicity, AllerTOP and ToxinPred servers, respectively. The physicochemical properties of the vaccine were checked using the ExPASy ProtParam tool, which predicted the molecular weight of the vaccine to be 39.62 kDa. The instability index and aliphatic index were predicted to be 39.53 and 67.34, respectively, which indicates that the vaccine is stable (an instability index below 40 denotes a stable protein) even over a varied range of temperatures (Walker, 2005). The soluble and hydrophilic nature of the vaccine was supported by the solubility score of 0.857 (scores >0.5 indicate solubility) and the GRAVY score of -0.179 (the lower the score, the greater the hydrophilicity; Hon et al., 2021; Walker, 2005). The vaccine was then subjected to homology modelling, and the best model was validated by Ramachandran plot analysis, Z-score analysis and ERRAT score. The vaccine showed desirable results in all these analyses, and the tertiary structure was thereby validated.
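Two of the physicochemical descriptors mentioned above have simple closed forms, sketched here in Python. GRAVY is the mean Kyte-Doolittle hydropathy over all residues, and the aliphatic index is X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], where X is the mole percentage of each residue. This is an illustrative sketch, not the ProtParam code; the function names are assumptions, and the instability index (which relies on a dipeptide weight table) is deliberately left out.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _clean(protein: str) -> str:
    """Upper-case the sequence and reject empty or non-standard input."""
    seq = protein.upper()
    if not seq or any(aa not in KYTE_DOOLITTLE for aa in seq):
        raise ValueError("sequence must be non-empty and use the 20 standard residues")
    return seq


def gravy(protein: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = _clean(protein)
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(protein: str) -> float:
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    seq = _clean(protein)

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))
```

A negative GRAVY value, such as the -0.179 reported for the construct, corresponds to a sequence that is hydrophilic on average.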
In selecting the best epitopes for linear vaccine construction, one of the most important criteria was antigenicity (an epitope with an antigenicity score greater than 0.4, as predicted by the VaxiJen v2.0 server, is considered antigenic). In addition, the best epitopes should be immunogenic (a positive score from the IEDB MHC I immunogenicity server indicates an immunogenic epitope), should bind multiple HLA alleles with high affinity and should be able to induce other secondary and tertiary immune responses. Based on these criteria, the three best CTL and HTL epitopes were identified. Among the CTL epitopes, LSPRWYFYY (from N protein, position 104), TMADLVYAL (from NSP12 protein, position 123) and SPRWYFYYL (from N protein, position 105) showed the highest antigenicity scores of 1.2832, 0.8208 and 0.734, respectively (Table 1). These were not only the most antigenic epitopes but were also immunogenic and able to bind multiple HLA alleles (Table 1). They were also interleukin-2 inducing epitopes (Supplementary Table 9), which increases the chances of triggering an effective immune response. Among the HTL epitopes, YRVVVLSFELLHAPA (from S protein, position 508), IGYYRRATRRIRGGD (from N protein, position 84) and SPFVMMSAPPAQYEL (from NSP3 protein, position 984) showed the best antigenicity scores of 0.7072, 0.6649 and 0.5833, respectively (Table 2). These epitopes also bound multiple HLA alleles with high affinity and were immunogenic (Table 2). In addition, they are interleukin-2 inducing epitopes (Supplementary Table 9), which increases their chances of initiating a strong immune response. It should be noted, however, that although the epitope LKSIAATRGATVVIG had the highest antigenicity score among the HTL epitopes, it was not an interleukin-2 inducing epitope. Hence, LKSIAATRGATVVIG was not counted among the three best epitopes, but it was included in the vaccine construct for its high immunogenicity and antigenicity scores. The final vaccine candidate was designed from the top-ranked CTL (9 amino acids long) and HTL (15 amino acids long) epitopes predicted from the S, N, NSP12 and NSP3 proteins (Tables 1 and 2). These epitopes were chosen based on criteria such as antigenicity, immunogenicity, capability to bind multiple class I and class II MHC alleles, and overlap between CTL and HTL epitopes. Once the best epitopes were picked, five linear vaccine constructs were generated from various combinations. Of these five constructs, the best was chosen based on Z-score, ERRAT and Ramachandran plot analysis (Supplementary Material SM4). The final vaccine construct (the vaccine proposed in this study) had a Z-score of -4.71 and an ERRAT score of 78.3051, and Ramachandran plot analysis showed 98.13% of residues in the favored region, 0% in the outlier region and 100% favored rotamers.
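The screening and linking logic described above can be sketched in Python as follows. The antigenicity cutoff (0.4), the positive-immunogenicity rule, the GPGPG linker, and the three HTL sequences with their antigenicity scores are taken from the text. Everything else is an assumption made for illustration: the class and function names, the placeholder immunogenicity value (the actual scores are in Table 2 and are not reproduced here), the made-up failing entry, and the ordering of epitopes by antigenicity. The joined string is therefore not the published construct, which also carries CTL epitopes and the CTB adjuvant.

```python
from dataclasses import dataclass

ANTIGENICITY_CUTOFF = 0.4  # VaxiJen v2.0 threshold quoted in the text
LINKER = "GPGPG"           # linker named in the text


@dataclass(frozen=True)
class Epitope:
    sequence: str
    antigenicity: float    # VaxiJen-style score
    immunogenicity: float  # IEDB-style score; positive means immunogenic
    allergenic: bool
    toxic: bool


def passes_screen(e: Epitope) -> bool:
    """Apply the four screening criteria listed in the text."""
    return (e.antigenicity > ANTIGENICITY_CUTOFF
            and e.immunogenicity > 0
            and not e.allergenic
            and not e.toxic)


def select_and_link(epitopes, linker: str = LINKER) -> str:
    """Keep epitopes that pass the screen, rank by antigenicity, join with the linker."""
    kept = sorted((e for e in epitopes if passes_screen(e)),
                  key=lambda e: e.antigenicity, reverse=True)
    return linker.join(e.sequence for e in kept)


PLACEHOLDER = 0.1  # hypothetical positive immunogenicity, NOT a reported value
candidates = [
    Epitope("IGYYRRATRRIRGGD", 0.6649, PLACEHOLDER, False, False),
    Epitope("YRVVVLSFELLHAPA", 0.7072, PLACEHOLDER, False, False),
    Epitope("SPFVMMSAPPAQYEL", 0.5833, PLACEHOLDER, False, False),
    # Made-up entry that fails the antigenicity cutoff, to show the filter working.
    Epitope("AAAAAAAAAAAAAAA", 0.2, PLACEHOLDER, False, False),
]
linked = select_and_link(candidates)
```

Ranking by antigenicity is only one possible ordering; the study generated five constructs from different combinations and selected among them by structural quality scores.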
The SARS-CoV-2 S protein binds to its main cellular receptor, angiotensin-converting enzyme 2 (ACE2), to infect most host cells (Zhou et al., 2020). TMPRSS2 (a host serine protease) is another important player, enabling the proteolytic priming of the S protein needed for receptor association and entry (Hoffmann et al., 2020). Once the viral and host membranes fuse upon S protein binding, the viral genomic RNA is immediately released into the cytoplasm (Hoffmann et al., 2020). Sometimes the virus is instead internalized into endosomes, where viral membranes fuse with endosomal membranes after cathepsin-mediated cleavage at low pH, allowing the nucleocapsid to enter the cytoplasm (Bayati et al., 2021). Once in the cytoplasm, the virus is likely to follow the same path as other CoVs, with its genome translated into the polyproteins pp1a and pp1ab (Sawicki et al., 2007; Thiel et al., 2003). These polyproteins encode the non-structural proteins (NSPs), including NSP3 and NSP12, which in turn form the viral replication-transcription complex (Sawicki et al., 2007; Thiel et al., 2003). A subsequent reorganization of membranes from the endoplasmic reticulum (ER) and Golgi produces double-membrane vesicles that compartmentalize replication and transcription of the virus (Knoops et al., 2008). Newly synthesized viral proteins are directed to the ER and Golgi membranes. Nascent viral particles are then formed when the newly synthesized viral proteins combine with the N protein and genomic RNA, and the virus is released into the extracellular space (De Wit et al., 2016; Lee et al., 2020). Innate immune cells such as macrophages, dendritic cells, neutrophils, monocytes and innate lymphoid cells (ILCs) are equipped with pathogen recognition receptors (PRRs) that recognize PAMPs, triggering inflammatory signaling pathways and immune responses (Diamond & Kanneganti, 2022). To date, several PRRs have been demonstrated to activate their signaling pathways in response to SARS-CoV-2, with TLRs being among the most important (Diamond & Kanneganti, 2022). Like many viruses, SARS-CoV-2 activates the innate immune system through TLR signaling (Diamond & Kanneganti, 2022). Hence, in this study the vaccine construct was docked with two different TLRs (TLR4 and TLR2) to assess whether the vaccine can elicit the required immune signaling upon interaction. The two most important adaptor molecules responsible for signal transduction are MyD88 and TRIF (Akira & Takeda, 2004). Upon recognition by the host TLRs, the vaccine candidate is expected to mount an immune response through MyD88, TRIF or both (as in the case of TLR4; Akira & Takeda, 2004). Once immune signaling is initiated, nuclear factor (NF)-κB, mitogen-activated protein kinases (MAPKs) and interferon (IFN) regulatory factors (IRFs) are all activated downstream of MyD88 (Akira & Takeda, 2004). Translocation of these molecules into the nucleus activates the transcription of several pro-inflammatory cytokines such as tumor necrosis factor (TNF), IL-6 and IL-1. Simultaneously, other innate immune sensor genes such as NLRP3 are transcribed, along with production of IFNs and IFN-stimulated genes (Akira & Takeda, 2004).
As far as adaptive immunity is concerned, the key players that help to tackle the infection are CD4+ T cells, CD8+ T cells and specific antibodies generated in response to the virus (Sette & Crotty, 2021). The designed vaccine comprises both CD8+ T cell (CTL) and CD4+ T cell (HTL) epitopes. Hence, the vaccine has the capacity to induce adaptive immunity, allowing viral clearance and development of long-lasting immunity. On recognizing the antigen, the T-cell receptors on CTLs send a signal to CD4+ cells, which differentiate into multiple distinct cell types (such as Th1 cells and Tfh cells) exhibiting a range of helper and effector functions (Sette & Crotty, 2021). The Th1 cells possess antiviral properties and help in viral clearance by bringing IFN-γ and other cytokines into action. The Tfh cells, on the other hand, help in B cell activation, which is necessary for the establishment of neutralizing antibody responses as well as the formation of memory B cells and long-term humoral immunity (Crotty, 2019).
The Toll-like receptors (TLRs) sense pathogen-associated molecular patterns (PAMPs) on a variety of microorganisms and hence play a critical role in activating innate immunity in the human body (Kawasaki & Kawai, 2014). Studies have shown that the SARS-CoV-2 S protein interacts directly with the extracellular domain of TLR4 with strong binding affinity (Zhao et al., 2021). The S protein also interacts with TLR2 and induces an immune response through the NF-κB pathway (Khan et al., 2021). The NF-κB pathway is likewise stimulated through TLR4 by interaction with the CTB adjuvant, leading to strong immune response generation (Phongsisay et al., 2015). The top-ranked vaccine candidate discussed earlier was therefore docked with the immune receptors (TLR4 and TLR2) to check its capacity to bind them, which in turn would increase its chances of generating an immune response. HADDOCK was used to dock the top-ranked construct (the vaccine proposed in this study). The TLR4-vaccine complex showed a HADDOCK score of -157.9 ± 1.6 and the TLR2-vaccine complex a HADDOCK score of -249.5 ± 5.4 (more negative scores suggest better binding), indicating that the vaccine exhibits good binding affinity with both TLR4 and TLR2. A low RMSD score of 0.6 ± 0.3 was observed for both docked structures, implying that they are of good quality with minimal deviations. In addition, other quantities such as van der Waals energy, restraints violation energy and solvent-accessible surface area were calculated, and the results are provided in Tables 6 and 7. The vaccine-TLR4 complex formed 4 disulphide bonds (Table 9), 2 salt bridges and 11 hydrogen bonds (Supplementary Material SM2), while the vaccine-TLR2 complex formed 1 disulphide bond (Table 9), 2 salt bridges and 26 hydrogen bonds (Supplementary Material SM3); a detailed list of the interacting residues is provided in Supplementary Material SM2 and SM3. Also, the binding energies of -11.4 and -21.8 kcal/mol for the TLR4 and TLR2 docked complexes, respectively, show that the interaction between the vaccine and the TLRs is strong and can trigger immune response generation.
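To put the reported binding energies on a more familiar scale, a binding free energy can be converted to a dissociation constant through the standard relation ΔG = RT ln(Kd). The Python sketch below does this conversion. The function name is illustrative, and the temperature of 298.15 K is an assumption, since the text does not state the temperature used for the prediction.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_to_kd(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Convert a binding free energy (kcal/mol) to a dissociation constant (M).

    Uses delta_G = R * T * ln(Kd); the default temperature is an assumption.
    """
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


# Binding energies quoted in the text for the two docked complexes.
kd_tlr4 = delta_g_to_kd(-11.4)
kd_tlr2 = delta_g_to_kd(-21.8)
```

Because the relation is exponential, the more negative energy reported for the TLR2 complex corresponds to a much smaller predicted Kd than for the TLR4 complex, consistent with the tighter binding described above.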
A molecular dynamics simulation (MDS) study was performed on the docked complexes to check the stability and flexibility of the interacting molecules under defined conditions (temperature, pressure, etc.). The complexes were subjected to a 150 ns simulation run, from which RMSD and RMSF plots were derived to describe the behaviour of the complexes under these conditions. Very little fluctuation in the RMSD plot suggests that a complex was stable throughout the simulation time. Similarly, high peaks in the RMSF plot indicate regions where the complex is flexible. Next, an immune simulation study was conducted to evaluate the performance of the vaccine candidate in generating a desirable immune response. The generated graphs demonstrate that the proposed vaccine is capable of eliciting a significant immune response in a variety of ways, including a surge in immunoglobulin production, a rise in the number of CTL and HTL cells and the induction of a variety of cytokines.
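The two quantities plotted from the trajectories can be written down compactly. RMSD measures how far a frame has moved from a reference structure, averaged over atoms, while RMSF measures how much each atom fluctuates around its own mean position over time. The pure-Python sketch below illustrates both definitions. It assumes the frames have already been superimposed on the reference, it is not the analysis code of any MD package, and the toy coordinates are made up.

```python
import math


def rmsd(frame, reference):
    """Root-mean-square deviation between two pre-aligned sets of (x, y, z) coordinates."""
    if len(frame) != len(reference) or not frame:
        raise ValueError("frames must be non-empty and contain the same number of atoms")
    squared = sum((a - b) ** 2
                  for p, q in zip(frame, reference)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(frame))


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation around the time-averaged position."""
    n_frames = len(trajectory)
    if n_frames == 0:
        raise ValueError("trajectory must contain at least one frame")
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        msd = sum(sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
                  for frame in trajectory) / n_frames
        result.append(math.sqrt(msd))
    return result


# Hypothetical two-atom, two-frame trajectory (arbitrary units), for illustration only.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(2.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
toy_rmsd = rmsd(toy_trajectory[1], toy_trajectory[0])
toy_rmsf = rmsf(toy_trajectory)
```

In the toy trajectory only the first atom moves, so it alone has a non-zero RMSF; this is the sense in which RMSF peaks localize flexibility to particular residues, whereas a flat RMSD curve reflects stability of the complex as a whole.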
On comparing the TLR4-vaccine complex with the TLR2-vaccine complex, it was found that the TLR2-vaccine complex docked better than the TLR4-vaccine complex (a more negative HADDOCK score indicates better docking; Tables 6 and 7). The RMSD score for both the TLR4-vaccine and TLR2-vaccine complexes during docking was 0.6 ± 0.3, suggesting that there were very few deviations during docking and that both structures are stable in their docked state. Additionally, both docked complexes were simulated at 310 K temperature and 1 bar pressure using MDS. The MDS results showed that both docked complexes behaved stably under these conditions. For the TLR4-vaccine complex, the RMSD increased over the first 50 ns, after which the system remained stable for the rest of the 150 ns simulation (with minimal fluctuation in RMSD between 1.25 and 1.50 nm; Figure 6). For the TLR2-vaccine complex, the RMSD increased for approximately the first 50 ns, after which the complex attained stability (with minimal fluctuation in RMSD between 1.0 and 1.5 nm; Figure 6). The RMSF plots for both complexes showed very little fluctuation in the residues, which indicates that the docked complexes are quite stable. Similar results were observed for the radius of gyration plots, indicating the compactness of the structures during simulation. All the results discussed above, including the binding affinity analysis and hydrogen bond and salt bridge formation (Supplementary Material SM2 and SM3), indicate that both complexes performed well with respect to the various criteria of binding. Taken together, however, the TLR2-vaccine complex showed comparatively better binding than the TLR4-vaccine complex.
A few research groups have used similar in silico tools to build epitope-based vaccines that have been effectively expressed in mouse models. Bazhan and colleagues used a similar array of in silico analyses to build an Ebola virus T-cell multi-epitope vaccine with the same epitope prediction servers as used in our study (Bazhan et al., 2019). Foroutan and colleagues evaluated the physicochemical properties of their designed vaccine against Toxoplasma gondii with the same tool used by us, and similar results with an enhanced immunogenic response were obtained when they assessed the vaccine's properties in a mouse model (Foroutan et al., 2020). In addition, many research groups have proposed vaccine candidates against SARS-CoV-2 that consider multiple proteins of the virus (Rouzbahani et al., 2022 [S and N proteins]; Adam, 2021 [S, M, N, E and ORF1a proteins]; Mir et al., 2022 [NSP12 protein]; Ong et al., 2020 [S, NSP3 and NSP8 proteins]; Safavi et al., 2020 [NSP7, 8, 9, 10, 11, 12, 14 and S proteins]). Compared with this previous body of work, our vaccine had mostly similar properties to the candidate vaccines already proposed. However, a few properties were better for our vaccine candidate and a few were better for the other candidates. Our vaccine candidate had an aliphatic index of 67.34, which was better than that of the vaccine candidate designed by Rouzbahani et al. but a little lower than that of a few other vaccine candidates (Adam, 2021; Safavi et al., 2020). Our vaccine candidate showed greater stability throughout a longer molecular dynamics simulation run of 150 ns when compared with other studies on the same target (Rouzbahani et al., 2022; Safavi et al., 2020). In addition, the immune response generated by the vaccine proposed in this study was better than that of other proposed candidates (Adam, 2021; Mir et al., 2022). That said, the quality of a designed vaccine does not depend on any single criterion but on an amalgamation of many. The designed vaccine performed very well on all the criteria, which include desirable physicochemical properties, structural stability, ability to generate an immune response and ability to bind immune receptors. The optimistic results obtained in all the in silico analyses portray it as a suitable vaccine candidate; however, in vitro and in vivo studies along with wet lab validation remain a pressing priority.
This study highlights an alternative to the traditional route for designing vaccine candidates. The designed vaccine is recommended based only on immunoinformatics-guided evaluations and is believed to be immunogenic, with the possibility of generating immune responses. However, the extent of protection that it can give against the virus remains unknown until experimental validation is performed. The population coverage of the vaccine candidate was checked using the IEDB server, which considers the binding of the peptides to the most frequently occurring HLA alleles in the world population. Though this method has proven to be quite reliable, it needs to be tested further. The in silico analyses indicate that the designed vaccine candidate could be effective in generating an immune response; however, wet lab validation remains a necessity. The next planned step for this vaccine is its in vitro and in vivo validation. The first step would be gene synthesis and cloning, followed by expression and purification. The next step would be detection and quantification of immune responses in suitable model organisms using assays such as Enzyme-Linked ImmunoSpot (ELISpot) and intracellular cytokine staining (ICS) (Bazhan et al., 2019). A further step would be determining the route of administration for the vaccine candidate, if it performs well in in vitro and in vivo validation. That said, the most important step would be immune profiling of the candidate vaccine to rule out any repercussions that might threaten the safety and efficacy of the designed vaccine (Li et al., 2014).

Conclusion
The world is in the middle of a COVID-19 outbreak, and scientists in the field of vaccination are working to develop a safe and efficient vaccine against SARS-CoV-2. Various research groups are therefore proposing different vaccine candidates and strategies to safeguard against the infectious SARS-CoV-2. In this study, we designed a chimeric, multi-epitope vaccine against the virus that includes antigenic and immunogenic epitopes from the spike, nucleocapsid, NSP3 and NSP12 proteins. The vaccine contains CTL and HTL epitopes as well as overlapping B cell epitopes and interleukin-2 inducing epitopes. The vaccine was subjected to molecular docking studies with TLR4 and TLR2 to study its interaction. The strong interaction predicted between the immune receptors and the vaccine indicates that the vaccine can trigger an immune response in the body. Furthermore, molecular dynamics simulation and immune simulation of the vaccine candidate were conducted to assess its stability and flexibility and its efficacy in immune response generation, respectively.

Figure 1. Schematic representation of the immune mechanism pathway in the body which is elicited by the designed multi-epitope vaccine.

Figure 2. (A) Schematic representation of the vaccine construct, (B) 3D model of the vaccine, (C) Z-score plot of the vaccine as predicted by the ProSA webserver, (D) Ramachandran plot of the vaccine as predicted by the MolProbity server, (E) ERRAT score plot as predicted by the SAVESv6.0 server.

Figure 3. (Ai-Aviii) Docking of individual CTL epitopes with MHC class I. These epitopes have shown successful interaction with the MHC class I molecule, which is depicted in pale green. (Bi-Bviii) Docking of individual HTL epitopes with MHC class II. These epitopes have shown successful interaction with the MHC class II molecule, which is depicted in pale blue. The red, magenta, blue and yellow spheres represent the epitopes present in the vaccine which belong to the N, S, NSP12 and NSP3 proteins, respectively.

Figure 4. (A) 3D representation of the docked TLR4-vaccine complex. Green represents the vaccine and blue represents the TLR4 structure. (B) Close-up of a few hydrogen bonds in the docked complex. (C) Interacting residues between chain A (TLR4) and chain B (vaccine).

Figure 5. (A) 3D representation of the docked TLR2-vaccine complex. Deep blue represents the vaccine and magenta represents the TLR2 structure. (B) Close-up of a few hydrogen bonds in the docked complex. (C) Interacting residues between chain A (TLR2) and chain B (vaccine).

Figure 7. Immune simulation results for (A) the proposed vaccine candidate and (B) the positive control C5. In each case: (i) antibody responses against the antigen, (ii) cytokine concentrations during the whole simulated period, evidencing the reaction to the vaccine, (iii) the corresponding count of the antibody-generating B cell population, (iv) activity of T helper cells, (v) activity of cytotoxic T cells, (vi) plasma B cell population during the simulation period.

Figure 8. The black part represents the pET-28a(+) expression vector, into which the codon-optimized multi-epitope vaccine (red part) is inserted.

Table 1. List of all selected CTL epitopes as predicted using NetCTL 1.2.

Table 2. List of all selected HTL epitopes as predicted using the NetMHCIIpan 3.2 server.

Table 3. Population coverage of the selected epitopes of the designed vaccine candidate, as predicted by the IEDB server.

Table 4. Linear/continuous B cell epitopes predicted by the ElliPro server.

Table 8. Binding energies of the docked complexes of the vaccine with TLR4 and TLR2, as predicted by the PRODIGY webserver.

Table 9. List of disulphide bonds present in the docked complexes of the vaccine with TLR2 and TLR4, as predicted by Design 2.0 server.