In silico based multi-epitope vaccine design against norovirus

Abstract Norovirus (NoV), a member of the Caliciviridae family, causes diarrhoea, vomiting, and stomach pain in people with acute gastroenteritis (AGE). A multi-epitope vaccine against single-stranded positive-sense viruses such as NoV is long overdue. Although candidate epitopes have been investigated before, molecular mimicry and the identification of new epitopes that induce T-cell and B-cell responses, which play an important role in cell-mediated and humoral immunity, have not been examined in great detail. The current study identifies new epitopes from various databases and filters them for antigenicity, allergenicity, and toxicity. The adjuvant β-defensin and different linkers were used for vaccine construction. The binding between the vaccine construct and toll-like receptor 3 (TLR3) was then assessed by molecular docking, followed by a 100 ns molecular dynamics (MD) simulation. The vaccine candidate shows good solubility, with a score of 0.530, a Z-score of -4.39 and a molecular docking score of -140.4 ± 12.1. The MD trajectories indicate that the complex of TLR3 and the vaccine candidate is stable, with an average RMSD of 0.91 nm, and that the highest-occupancy H-bond in the system (61.55%) forms between GLU127 of TLR3 and TYR10 of the vaccine candidate. Four more H-bonds between TLR3 and the vaccine candidate have an occupancy above 32%, which contributes to the stability of the complex. Thus, the multi-epitope vaccine developed in the present study forms a basis for further experimental investigations towards a potentially effective vaccine against NoV. Communicated by Ramaswamy H. Sarma


Introduction
Norovirus (NoV), the main cause of acute gastroenteritis (AGE) in children and adults (Hall et al., 2013; Vega et al., 2014), affects over 699 million people and causes over 219,000 deaths every year (Bartsch et al., 2016). NoV, also known as the winter vomiting bug, belongs to the Caliciviridae family of non-enveloped, single-stranded positive-sense RNA viruses. The infection is extremely contagious, is transmitted by the fecal-oral route through person-to-person contact, and regularly causes large outbreaks in closed environments through contaminated water and food. While the illness is frequently self-limiting, growing epidemiological evidence suggests that NoVs can cause severe diarrhea, especially in the elderly.
The World Health Organization (WHO) stated in 2016 that the development of a NoV vaccine should be regarded as a top priority (Esposito & Principi, 2020). Several NoV vaccine candidates have been developed (Mattison et al., 2018), a subset of which have undergone advanced clinical trials or extensive preclinical testing, but to date no vaccine or antiviral drug is available for human norovirus (Lucero et al., 2018). The genogroups (G) of NoV strains that infect humans are GI, GII, and GIV. The GI genogroup is responsible for just 10% of NoV illness, especially in the United States, while GIV viruses are rarely reported (Eden et al., 2012). The GII strains, however, are responsible for nearly 90% of illnesses (Hallowell et al., 2019) and were therefore selected for this study.
Adaptive immunity, mediated by T and B lymphocytes, plays an important role in the cell-mediated and humoral responses that recognize the virus (Sanchez-Trincado et al., 2017). Vaccines therefore work best when they boost the body's immune response against viruses. Epitope-based vaccine design employs bioinformatics tools that enable faster development of vaccines against the virus (Fleri et al., 2017). Multi-epitope vaccines, which are made up of antigens (epitopes), peptide linkers, and adjuvants rather than complete viral proteins, contribute to enhanced protection by generating distinct immune responses with minimal adverse effects, besides being cost- and time-effective (Parvizpour et al., 2020). Antibodies are produced in response to the antigens, whereas interferons (IFN) are produced by host cells that are affected by virus proteins. The present study focuses on the design of a vaccine candidate using in silico approaches, with substantial focus on validating the candidates (Fleri et al., 2017; Khairkhah et al., 2020; Rawal et al., 2021). The predicted T-cell and B-cell epitopes were screened for antigenicity, toxicity, allergenicity and immunogenicity to construct a multi-epitope vaccine candidate, which was then validated by molecular dynamics simulation against the TLR3 receptor (PDB ID: 1ZIW) to find a stable vaccine against NoV.

Materials and methods
The computational tools and web servers that are considered in the present study are detailed below:

Protein sequence retrieval and determination of conserved region
The NoV protein sequences were extracted from the ViPR database (Pickett et al., 2012) [BOX 01] using the keyword "norovirus". The capsid protein (Reference no. BAD14534.1, UniProtKB: Q76AL3) of type GII strains was also retrieved from this database and saved in FASTA format. The conserved region was determined by a preliminary analysis of the NoV protein sequences with the Basic Local Alignment Search Tool (BLASTp) (Johnson et al., 2008) [BOX 02], using the BLOSUM62 matrix and the non-redundant protein sequence database in a protein-protein search. Sequences with query coverage >96% and percentage identity >96% were selected for further study.
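The coverage and identity cut-offs described above can also be applied programmatically. The following is a minimal sketch, not the procedure used in the study: it assumes the BLASTp hits were exported as tab-separated text with three columns (subject ID, percent identity, query coverage), for example via the BLAST+ option `-outfmt "6 sseqid pident qcovs"`. The function name `filter_hits` and the sample rows are hypothetical.

```python
import csv
import io


def filter_hits(tsv_text, min_identity=96.0, min_coverage=96.0):
    """Return subject IDs whose identity and query coverage both exceed the cut-offs."""
    kept = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        sseqid, pident, qcovs = row[0], float(row[1]), float(row[2])
        if pident > min_identity and qcovs > min_coverage:
            kept.append(sseqid)
    return kept


# Hypothetical rows for illustration only; these are not results from the study.
example = "hitA\t98.5\t100\nhitB\t91.2\t99\nhitC\t97.0\t95\n"
selected = filter_hits(example)
print(selected)
```

Only `hitA` passes both cut-offs in this toy input; `hitB` fails on identity and `hitC` on coverage.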

Prediction of antigens and T lymphocyte (T-cell) epitopes
The VaxiJen v2.0 (Doytchinova & Flower, 2007a, 2007b, 2008) [BOX 03] web server, which is the first server to predict protective antigens without regard to alignment, was used to predict protective antigens from the conserved protein sequences of NoV. It enables antigen classification based solely on physicochemical properties of proteins rather than sequence alignment. The T-cell epitopes were predicted using the NetCTL 1.2 server (Larsen et al., 2005, 2007) [BOX 04] for the protein sequences selected with VaxiJen. NetCTL combines the predictions of peptide MHC class I binding, proteasomal C-terminal cleavage, and transporter associated with antigen processing (TAP) transport efficiency. Using 12 MHC class I supertypes, the epitopes of cytotoxic T lymphocytes (CTL) were predicted.

Filtering toxin, non-antigenic, allergic epitopes
The ToxinPred web server (Open Source Drug Discovery Consortium, 2013) [BOX 05,12] was used to predict the toxicity of the T-cell epitopes. A quantitative matrix (QM), dipeptide-based model was used for predicting toxicity, as its accuracy was 94.50% and its results are more biologically relevant. The non-toxic epitopes were re-tested for antigenicity with VaxiJen v2.0 [BOX 06], and those scoring above the 0.5 threshold were selected. AllerTop v2.0 (Dimitrov et al., 2014) [BOX 07,13] was used to sieve out the allergens among the epitope set. It applies an auto cross covariance transformation (Guo et al., 2008) to the amino acid sequences, based on various calculable properties, and performs a k-nearest neighbours (KNN) classification against a training dataset of validated allergens and non-allergens.
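The three filters form a simple cascade: toxicity first, then antigenicity, then allergenicity. The sketch below illustrates that order under the assumption that the server outputs have already been collected by hand into one record per epitope. The `EpitopeRecord` class, the `filter_epitopes` function and the peptide strings are hypothetical, and the sketch does not call ToxinPred, VaxiJen or AllerTop.

```python
from dataclasses import dataclass


@dataclass
class EpitopeRecord:
    sequence: str
    is_toxic: bool        # toxicity call, e.g. as reported by ToxinPred
    antigenicity: float   # antigenicity score, e.g. as reported by VaxiJen
    is_allergen: bool     # allergenicity call, e.g. as reported by AllerTop


def filter_epitopes(records, antigenicity_threshold=0.5):
    """Keep epitopes that are non-toxic, above the antigenicity threshold, and non-allergenic."""
    non_toxic = [r for r in records if not r.is_toxic]
    antigenic = [r for r in non_toxic if r.antigenicity > antigenicity_threshold]
    return [r for r in antigenic if not r.is_allergen]


# Placeholder records for illustration only; not epitopes from the study.
records = [
    EpitopeRecord("AAAAAAAAA", False, 0.81, False),  # passes all three filters
    EpitopeRecord("CCCCCCCCC", True, 0.90, False),   # removed: toxic
    EpitopeRecord("DDDDDDDDD", False, 0.42, False),  # removed: below threshold
    EpitopeRecord("EEEEEEEEE", False, 0.77, True),   # removed: allergen
]
final = filter_epitopes(records)
print([r.sequence for r in final])
```

Applying the filters in sequence mirrors the order in the text, so the number of epitopes surviving each stage can be reported separately.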

Selection of conserved and higher immunogenic epitopes
Highly conserved epitopes were selected using the IEDB (Vita et al., 2019) Epitope Conservancy Analysis tool (Bui et al., 2007) [BOX 08], and the epitopes with higher scores, which have a greater probability of eliciting an immune response, were selected as the final T-cell epitopes using IEDB-CII (Class I immunogenicity) (Calis et al., 2013) [BOX 09].

Prediction of HTL and B-cell epitope
Helper T lymphocyte (HTL) epitopes are important for adaptive immunity, as they help B cells to produce antibodies (Alberts, 2017). HTLs are also called CD4+ cells and are activated by binding to major histocompatibility complex (MHC) class II molecules (Chisholm, 2018). The IFN epitope prediction method predicts and ranks the potential IFN-γ inducing epitopes based on MHC class II binders in a given list of epitopes. The HTL epitopes were predicted with the IEDB MHC II binding tool using the IEDB_recommended 2.22 method, which is a consensus combination of SMM, NN and NetMHCIIpan. Binders to HLA molecules with a percentile rank below 10 were considered for IFN-γ epitope prediction. All positive MHC II binding epitopes from the IFN epitope prediction method were selected and further validated for toxicity, antigenicity and allergenicity using ToxinPred, VaxiJen v2.0 and AllerTop v2.0, respectively.

Predicting linear B-cell epitopes
The goal of B-cell epitope prediction is to identify linear B-cell epitopes, stretches of successive residues that antibodies can recognize. BcePred (Saha & Raghava, 2004) [BOX 14] predicts linear B-cell epitopes from the physico-chemical characteristics of epitopes in its database. Its dataset comprises 1029 B-cell epitopes from the Bcipep database and an equal number of non-epitopes from the Swiss-Prot database. The overlapping epitopes from both the HTL and linear B-cell predictions were considered as the final B-cell epitopes.

Design of multi-epitope vaccine construct
A multi-epitope vaccine construct was created from the T-cell epitopes and linear B-cell epitopes. A 45-mer β-defensin peptide adjuvant (UniProtKB: Q5U7J2) was inserted at different locations in the N-terminus in order to improve the immune response. The EAAAK linker was used to connect the adjuvant to the T-cell epitopes and also at the C-terminus. The GGGGS linker was used to connect two T-cell epitopes, while the GGGS linker was used to connect B-cell epitopes. These linkers ensure that the epitopes are presented effectively in the body and that immunity is maximized, in addition to ensuring effective separation of the individual epitopes (Khan et al., 2021; Chen et al., 2013). The constructed multi-epitope vaccine was verified for toxicity, allergenicity and antigenicity using [BOX 15] ToxinPred, AllerTop v2.0 and VaxiJen v2.0, respectively.
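The linker scheme can be illustrated with a short string-assembly sketch. It rests on two assumptions that the text does not settle: that the T-cell block and the B-cell block are bridged by a GGGS linker, and that "at the C-terminus" means a trailing EAAAK after the last B-cell epitope. The function name `build_construct` and the toy sequences are hypothetical and do not reproduce any of the study's constructs.

```python
def build_construct(adjuvant, t_epitopes, b_epitopes):
    """Assemble adjuvant, T-cell epitopes and B-cell epitopes with the linkers named in the text."""
    t_block = "GGGGS".join(t_epitopes)  # GGGGS between T-cell epitopes
    b_block = "GGGS".join(b_epitopes)   # GGGS between B-cell epitopes
    # Assumption: the two blocks are bridged with GGGS and the construct
    # ends with an EAAAK linker; the source does not state this explicitly.
    return adjuvant + "EAAAK" + t_block + "GGGS" + b_block + "EAAAK"


# Toy placeholder sequences, not the adjuvant or epitopes from the study.
construct = build_construct("MKK", ["AAA", "CCC"], ["DDD", "FFF"])
print(construct)
```

Keeping the assembly in one function makes it straightforward to generate several permutations of epitope order and pass each resulting sequence to the downstream checks.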

Structure modeling and molecular docking
The I-TASSER web server (Roy et al., 2010) [BOX 16] was used for three-dimensional (3D) structure modeling of the constructed vaccine, and the resultant models were further assessed and verified. The Ramachandran plot of the 3D structure was analyzed using Chimera (Pettersen et al., 2004). The PSIPRED web server (Jones, 1999) was used for secondary structure analysis of the constructed vaccine. ERRAT (Colovos & Yeates, 1993) and Verify 3D (Eisenberg et al., 1997) analyses were performed for further structure validation. The ProSA web server (Wiederstein & Sippl, 2007) [BOX 17] was used for structure refinement and validation analysis.

Molecular dynamics simulation
The docked complex of the vaccine construct and the receptor (TLR3) with the lowest energy was considered for molecular dynamics (MD) simulation.

Molecular simulation and analysis of vaccine construct
The C-ImmSim server (Rapin et al., 2010) [BOX 18], an in silico approach that describes both cellular and humoral responses of a mammalian immune system, was used to validate the vaccine construct's immune response profile. The vaccine profile was examined for a single injection using default simulation parameters, with a phase of 1000 runs and a vaccine that does not contain LPS (lipopolysaccharide). The vaccine construct was analysed for physico-chemical properties with ProtParam (The Proteomics Protocols Handbook, 2005), and Prosol (Gasteiger et al., 2005) was used for solubility. Figures 1 and 2 depict the workflow for predicting a multi-epitope NoV vaccine and show the computational tools and software used for each task. Each box has three parts: the top part gives the position of the task in the sequence, the middle part describes the function, and the bottom part names the resource used.

Results
A capsid protein (Reference no. BAD14534.1, UniProtKB: Q76AL3) of type GII NoV strains (Parra, 2019), retrieved from the ViPR database, was used to identify the conserved NoV protein sequences using BLASTp. A total of 45 human NoV protein sequences with an identity percentage greater than 96% were downloaded. The protective antigenic protein of NoV was predicted with VaxiJen v2.0 using a threshold value of 0.5 for higher accuracy and specificity. The UniProtKB entry Q76AL3 has the highest antigenic protein score, 0.6648, and all NoV protein sequences demonstrated protective antigenic properties. T-cell epitopes were predicted based on the combined score for all 12 MHC class I supertypes, with a threshold value of 0.75 for higher accuracy, giving 80% sensitivity and 97% specificity. A total of 81 T-cell 9-mer epitopes were predicted by the NetCTL 1.2 server and then subjected to various filters; 64 epitopes were predicted to be non-toxic by the quantitative matrix (QM), dipeptide method. The non-toxic epitopes were then cross-validated for antigenicity, and 41 epitopes were found to be above the threshold of 0.5, of which 15 epitopes were found to be non-allergenic. The final T-cell epitopes were chosen based on their immunogenicity and conservancy. The epitopes exhibiting the lowest IC50 values (<750 nM) were considered for HLA allele selection. The VaxiJen score, immunogenicity score and predicted NetCTL E-score for the corresponding epitopes are described in Table 1. The final overlapping HTL and B-cell epitopes, which were predicted using IEDB MHC II binding with epitope ranks between 2.2 and 9.5, are shown in Table 1 along with their HLA alleles and VaxiJen scores (Singh et al., 2013).
Eight short T-cell epitope sequences and six short B-cell epitope sequences were screened in various permutations and combinations (Chen et al., 2013; Ertl et al., 1991). Five vaccine sequences (V1-5) were constructed using epitopes with high potential linear scores, selected using random forest algorithms. The constructs exhibited good solubility (>0.5), as well as physicochemical properties such as aliphatic stability >75% and a grand average of hydropathicity (GRAVY) index between -0.1 and -0.2, indicating that the candidates are hydrophilic in nature. The vaccine candidates also exhibited a half-life of 30 h, indicating higher efficiency (Table 2; supplementary material Figure 1).
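For readers who want to see how such physicochemical indices are obtained, the sketch below re-implements two of them from their published definitions. GRAVY is the mean Kyte-Doolittle hydropathy over all residues, and the aliphatic index follows the standard formula based on the mole percentages of Ala, Val, Ile and Leu. This is an illustrative re-implementation, not the ProtParam code, and the function names and test peptides are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy per residue (negative = hydrophilic)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)), X in mole percent."""
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Toy peptides for illustration only; not sequences from the study.
print(gravy("AG"))
print(aliphatic_index("AVIL"))
```

A GRAVY value below zero, such as the range reported above, corresponds to a sequence that is hydrophilic on average.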

Molecular docking
The modeled structures of the constructed vaccines (V1-5) were docked to the human TLR3 receptor protein, and the molecular interaction studies yielded best poses with lowest energy scores ranging from -122.6 ± 31.1 to -140.4 ± 12.1 (Figure 4). V2 was found to require the lowest energy to interact with the TLR3 receptor protein (Table 4). Both 2D and 3D visualizations were used to identify the interacting residues, as shown in Figure 5. Our observations correlate with the in vitro studies on TLR3 receptor residues by Sun et al., which demonstrate that Arg252, Asn413, Arg635, Asn361, Asn457, Arg251, Asn380, Arg325, Asp242 and Arg222 are involved in the N-glycosylation pathway, resulting in upregulation of the immune system (Sun et al., 2006).

Molecular simulations
The MD simulation was performed on V2 (vaccine_construct-TLR3 complex), and an average RMSD of 0.91 nm was maintained throughout the system, as shown in Figure 6(a). The number of hydrogen bonds in the simulated complex was spread across the range 2 to 7, indicating strong binding of the vaccine_construct-TLR3 complex, as represented in Figure 6(b). Although the maximum number of H-bonds observed was 9 (Figure 6(c)), the number of stabilized H-bonds was 3 on average. The highest-occupancy H-bond in the system (61.55%) was formed between GLU127 of TLR3 and TYR10 of the vaccine construct. Other H-bonds between TLR3 and the vaccine construct with high occupancy are shown in Table 5.
Figure 7. The results of C-ImmSim server prediction of immune response after administering vaccine construct.
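H-bond occupancy is the percentage of trajectory frames in which a given donor-acceptor pair is hydrogen bonded. The sketch below shows that calculation under the assumption that the H-bonds present in each frame have already been extracted by an MD analysis tool. The function name `hbond_occupancy`, the residue labels and the four toy frames are hypothetical and are not meant to reproduce the values in Table 5.

```python
from collections import Counter


def hbond_occupancy(frames):
    """Map each (donor, acceptor) pair to the percentage of frames in which it is present.

    frames: list of sets, one set of (donor, acceptor) pairs per trajectory frame.
    """
    counts = Counter(pair for frame in frames for pair in frame)
    n_frames = len(frames)
    return {pair: 100.0 * count / n_frames for pair, count in counts.items()}


# Toy trajectory of four frames, for illustration only.
pair_a = ("TLR3:GLU127", "VAC:TYR10")
pair_b = ("TLR3:ARG252", "VAC:ASP30")
frames = [{pair_a, pair_b}, {pair_a}, {pair_a}, set()]
occupancy = hbond_occupancy(frames)
print(occupancy[pair_a], occupancy[pair_b])
```

Because each frame is a set, a pair is counted at most once per frame, so occupancy cannot exceed 100%.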
The results indicate that the novel multi-epitope vaccine designed in the current study is a promising candidate against NoV. The C-ImmSim simulation results show that from the 5th day onwards there was an increase in IgM and IgG, along with a decrease in the concentration of antigens (Figure 7(a)). There was also an increase in the B-cell population, which resulted in increased expression of immunoglobulins and increased B-cell memory development (Figure 7(b)). Furthermore, a consistent increase in TH memory cells was observed from the 5th day onwards (Figure 7(c)). A significant stimulation of IFN-γ development was also observed after immunization (Figure 7(d)) (Ząbczyńska & Pocheć, 2015). Our findings also indicate that the T-cell and B-cell populations were extremely sensitive in the production of memory cells, whereas the rest of the immune cell population was consistent.

Discussion
The design of a multi-epitope vaccine for norovirus (NoV), which causes acute gastroenteritis (AGE) in children and adults, is one of the major challenges in computational drug design. The World Health Organization (WHO) has also stated that the development of a NoV vaccine should be regarded as a top priority. The purpose of this study is to predict a potential multi-epitope vaccine candidate for NoV. In silico approaches using bioinformatics tools help to provide a potential candidate for NoV (Abraham Peele et al., 2021; Kolla et al., 2021; Rawal et al., 2021) that can bind to TLR3 receptors.
The 45 FASTA-format sequences of GII human NoV used in this study focus on conserved NoV variants. The new epitopes, which were identified from various databases, were filtered for antigenicity, allergenicity, and toxicity. The adjuvant β-defensin and different linkers were used for vaccine construction to make the vaccine more efficient. Further, the binding between the vaccine construct and TLR3 was determined by molecular docking analysis, followed by a molecular dynamics simulation of 100 ns. The RMSD analysis of the complex, along with the radius of gyration and RMSF analysis, revealed that the constructed vaccine is stable.
Our observations indicate that the novel multi-epitope vaccine designed in the current study is a promising candidate against NoV, given the increase in the B-cell population along with a consistent increase in TH memory cells and a significant stimulation of IFN-γ development. Taken together, our observations suggest that the novel epitopes designed in this study can be used as templates for further in vitro studies to design vaccines against NoV.

Conclusion
The current in silico study focused on the design of multi-epitope vaccine candidates against NoV. An integrated bioinformatics approach using T-cell and B-cell epitope databases was used to screen bona fide epitopes, combined with an adjuvant and linkers, for eliciting a strong immune response. The structure modeling, molecular interaction and MD simulation revealed that the complex formed by the TLR3 receptor with the designed vaccine candidate is stable, with good H-bond occupancy. A significant proportion of amino acid residues of the vaccine candidates was found to be involved in N-glycosylation and, therefore, in upregulation of the immune system. Furthermore, the increase in the B-cell population, along with a consistent increase in TH memory cells and a significant stimulation of IFN-γ development, indicates that the vaccine candidates also elicited a good immune response.

Disclosure statement
There is no conflict of interest.

Funding
The author(s) reported there is no funding associated with the work featured in this article.