Design of a multi-epitope Zika virus vaccine candidate – an in-silico study

Abstract Zika virus (ZIKV) is an RNA virus that causes a rapidly spreading disease transmitted by Aedes mosquitoes. Currently, there are neither effective vaccines nor therapeutics available to prevent or treat ZIKV infection. To address these unmet medical needs, we aimed to design a B- and T-cell multi-epitope-based subunit vaccine candidate against ZIKV using an in silico approach. We applied immunoinformatics, molecular docking, and dynamics simulation assessments targeting the most immunogenic proteins, namely the capsid (C) and envelope (E) proteins and the non-structural protein 1 (NS1), for which immunodominant B- and T-cell epitopes were predicted in our previous study. The final non-allergenic and highly antigenic multi-epitope construct was composed of screened immunogenic epitopes (3 CTL and 3 HTL) and β-defensin as an adjuvant, linked using AAY, GPGPG, and EAAAK linkers, respectively. The final construct, containing 143 amino acids, was characterized for its allergenicity, antigenicity, and physicochemical properties, and was found to be safe and immunogenic with good predicted solubility. The presence of IFN-γ-inducing epitopes supports its capacity to trigger strong immune responses. Subsequently, molecular docking between the vaccine and immune receptors (TLR2/TLR4) revealed good binding affinity and stable molecular interactions. Molecular dynamics simulation confirmed the stability of the complexes. Finally, the construct was subjected to in silico cloning, demonstrating the efficiency of its expression in E. coli. However, this study requires experimental validation to demonstrate vaccine safety and efficacy. Communicated by Ramaswamy H. Sarma


Introduction
In 2015, the Zika virus (ZIKV) caused an epidemic in Brazil, resulting in an estimated 200,000 identified cases (Poland et al., 2019) and initiating a spectrum of congenital diseases, including microcephaly in newborns, as well as Guillain-Barré syndrome (GBS) in adults (Prasasty et al., 2019). ZIKV is usually transmitted by the bites of infected Aedes mosquitoes; it can also spread through laboratory contamination, sexual transmission, maternofetal transmission, and blood transfusion (Musso & Gubler, 2016). Therefore, the prevention measures used to control ZIKV infection are controlling the vectors, reducing sexual transmission, and wearing clothes that cover as much of the body as possible. Currently, there are neither effective vaccines nor therapeutics available to prevent or treat ZIKV disease (Russo et al., 2017).
ZIKV (family Flaviviridae, genus Flavivirus) is a positive-sense, single-stranded ribonucleic acid (RNA) virus with a genome size of 10,794 nucleotides (approximately 10.8 kb) (Ezzemani et al., 2021a). The RNA is translated into a long polyprotein encoding three structural proteins (capsid (C), pre-membrane/membrane (prM/M), and envelope (E)) that contribute to the structural organization of ZIKV, as well as seven non-structural proteins (NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5) that play an essential role in virus replication (Sirohi & Kuhn, 2017). Among the non-structural proteins, NS1, with its two glycosylation sites, associates with lipids inside cells by forming homodimers, and forms a hexameric lipoprotein particle in the extracellular space. NS1 is thereby involved in immune evasion and pathogenesis through its interaction with host immune components (Lin et al., 2018). The E protein, in turn, is a major component involved in receptor binding, membrane fusion, and host immune recognition (Lin et al., 2018).
To design an effective subunit vaccine candidate against different ZIKV strains, multiple antigenic epitopes should be selected, followed by the addition of an appropriate adjuvant to enhance the host immune response. For this purpose, we investigated an approach termed "multi-epitope-based subunit vaccines", building on our previous study (Ezzemani et al., 2021b), which predicted the epitopes likely to elicit humoral (B-cell) and cell-mediated immunity (cytotoxic T lymphocyte (CTL) epitopes binding MHC I alleles and helper T lymphocyte (HTL) epitopes binding MHC II alleles). The cell-mediated immune response plays a major role in therapeutic vaccine applications by eliminating chronically infected cells from the body (Patronov & Doytchinova, 2013). Hence, this study uses the structural proteins (capsid and envelope) and NS1 to formulate the multi-epitope vaccine. These proteins have potential in vaccine development (Carpio & Barrett, 2021; Li et al., 2019; Salvador et al., 2019). Building the vaccine requires an appropriate adjuvant and linkers. The multi-epitope vaccine was then designed and its primary and tertiary structures were determined, followed by refinement and validation. The physicochemical properties, solubility, allergenicity, and antigenicity of the vaccine candidate were also investigated using immunoinformatics tools. The vaccine consists of epitopes that have a high ability to stimulate cellular immune responses and effectively induce interferon-gamma (IFN-γ) production. Subsequently, the binding affinity between the vaccine protein and Toll-like receptors (TLR2/TLR4) was determined by molecular docking. Furthermore, a molecular dynamics (MD) simulation was performed. Finally, in silico cloning was carried out to show the expression efficiency of the vaccine in an E. coli system.

Construction of multi-epitope vaccine sequence
High-priority T- and B-cell epitopes were selected from the previous study based on a low percentile rank and an IC50 binding affinity below 50 nM, a range in which peptides are classified as having high affinity for MHC. Their ability to trigger an immune response was then assessed based on immunogenicity and antigenicity scores. These epitopes were included in the vaccine design. The CTL epitopes were joined with AAY linker peptides, and the HTL epitopes were joined using GPGPG linkers. The adjuvant β-defensin was linked at the N-terminus with the EAAAK linker to boost the immunogenicity of the vaccine.
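The selection rule described above can be sketched as a simple filter. This is only an illustration: the `Epitope` record, the `max_percentile` cutoff (the text says only that a low percentile was preferred), and the candidate peptides are all hypothetical; only the 50 nM IC50 threshold comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Epitope:
    sequence: str
    ic50_nm: float          # predicted MHC binding affinity in nM
    percentile_rank: float  # lower means stronger predicted binding


def select_high_affinity(epitopes, ic50_cutoff_nm=50.0, max_percentile=1.0):
    """Keep epitopes with IC50 below the cutoff and a low percentile rank.

    The 50 nM cutoff follows the text; max_percentile is an assumed value.
    """
    kept = [
        e for e in epitopes
        if e.ic50_nm < ic50_cutoff_nm and e.percentile_rank <= max_percentile
    ]
    # Strongest predicted binders first.
    return sorted(kept, key=lambda e: e.ic50_nm)


# Made-up candidates for illustration only (not results from the study).
candidates = [
    Epitope("PEPTIDEAA", ic50_nm=12.0, percentile_rank=0.2),
    Epitope("PEPTIDEBB", ic50_nm=75.0, percentile_rank=0.4),
    Epitope("PEPTIDECC", ic50_nm=30.0, percentile_rank=3.0),
    Epitope("PEPTIDEDD", ic50_nm=5.0, percentile_rank=0.1),
]
selected = select_high_affinity(candidates)
print([e.sequence for e in selected])
```

In practice the surviving peptides would still need the immunogenicity, antigenicity, and toxicity screens mentioned in the text before entering the construct.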

IFN-γ-inducing epitopes prediction
The presence of interferon-γ (IFN-γ) inducing epitopes in the multi-epitope vaccine sequence was predicted with the IFNepitope server (http://crdd.osdd.net/raghava/ifnepitope) using the scan module. The hybrid model, which combines motif-based and support vector machine (SVM) approaches, was selected. This approach combines the strengths of both techniques to discover motifs or patterns in IFN-γ inducing and non-inducing MHC binders. A prediction score ≥ 1.0 was set as the threshold for good IFN-γ inducing epitopes.

Molecular dynamics (MD) simulation
The MD simulations were run on the iMODS online server (http://imods.chaconlab.org/) using normal mode analysis (NMA). This tool helps to visualize and analyze the structural dynamics of the docked complexes and to determine their functional molecular motions. Using the default parameters, the web server generates the deformability, B-factor, eigenvalues, variance, covariance matrix, and elastic network of each complex (López-Blanco et al., 2014).

Codon optimization and in silico cloning
Codon optimization was performed using the Java Codon Adaptation Tool (JCat) server (http://www.jcat.de/) to ensure that the vaccine candidate allows a feasible plasmid construction based on the E. coli expression strain K12. The output consists of the generated cDNA sequence and two other parameters: the percentage GC content of the improved sequence and the Codon Adaptation Index (CAI) score. HindIII and BamHI restriction sites were then added to the DNA sequence of the vaccine to ensure proper insertion into the plasmid, and the optimized multi-epitope vaccine sequence was inserted into the pET-28a(+) expression vector using SnapGene software.
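The two reported parameters have simple definitions, sketched below under stated assumptions. GC content is the percentage of G and C bases; the CAI is computed here as the geometric mean of per-codon relative adaptiveness values. The `toy_weights` table and both test sequences are hypothetical and are not E. coli K12 data or JCat output.

```python
import math


def gc_content(dna):
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    return 100.0 * sum(base in "GC" for base in dna) / len(dna)


def codon_adaptation_index(dna, weights):
    """Geometric mean of the relative adaptiveness w of each codon.

    `weights` maps codon -> w in (0, 1], where w is the codon's usage
    divided by the usage of the most frequent synonymous codon in the
    host reference set. Codons absent from `weights` are skipped.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    return math.exp(sum(logs) / len(logs))


# Toy weights for two amino acids only (hypothetical values).
toy_weights = {"CTG": 1.0, "CTC": 0.2, "AAA": 1.0, "AAG": 0.3}

optimized = "CTGAAACTGAAA"    # uses only the preferred codons
unoptimized = "CTCAAGCTCAAG"  # uses only the rarer codons
print(gc_content(optimized), codon_adaptation_index(optimized, toy_weights))
print(gc_content(unoptimized), codon_adaptation_index(unoptimized, toy_weights))
```

Under this definition, a CAI of 1.0 means that every codon in the sequence is the most frequently used synonymous codon in the reference set.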

Selection of epitopes, toxicity, and construction of multi-epitope vaccine sequence
The vaccine was generated using bioinformatics tools and consisted of 143 amino acid residues derived from the adjuvant and the predicted CTL epitopes (MVLAILAFLR, positions 46-55; ETLHGTVTV, positions 321-329; and TASGRVIEEW, positions 131-140) and HTL epitopes (IIKKFKKDLAAMLRI, positions 80-94; ENSKMMLELDPPFGD, positions 371-385; and DGCWYGMEIRPRKEP, positions 156-170). The epitopes were chosen for their high binding affinity score, non-toxicity, and immunogenicity, and were joined in the vaccine construct by different linkers (Figure 1).
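The assembly of the construct can be sketched as plain string concatenation. The epitope sequences and the three linkers are taken from the text; everything else is an assumption for illustration: the β-defensin sequence is not given, so a placeholder stands in for it, and the linker placed between the CTL block and the HTL block (`block_linker`) as well as the epitope order are guesses, since the text does not specify them.

```python
EAAAK, AAY, GPGPG = "EAAAK", "AAY", "GPGPG"

# Epitope sequences as listed in the text.
CTL_EPITOPES = ["MVLAILAFLR", "ETLHGTVTV", "TASGRVIEEW"]
HTL_EPITOPES = ["IIKKFKKDLAAMLRI", "ENSKMMLELDPPFGD", "DGCWYGMEIRPRKEP"]


def assemble_construct(adjuvant, ctl, htl, block_linker=GPGPG):
    """Build adjuvant-EAAAK-CTL block-block_linker-HTL block.

    CTL epitopes are joined with AAY and HTL epitopes with GPGPG.
    The block_linker default is an assumption, not taken from the text.
    """
    ctl_block = AAY.join(ctl)
    htl_block = GPGPG.join(htl)
    return adjuvant + EAAAK + ctl_block + block_linker + htl_block


# Placeholder adjuvant: the real beta-defensin sequence is not in the text.
placeholder_adjuvant = "X" * 10
construct = assemble_construct(placeholder_adjuvant, CTL_EPITOPES, HTL_EPITOPES)
print(construct)
print(len(construct))
```

Because of the placeholder adjuvant and the assumed junction, the length printed here is not expected to match the 143 residues of the actual construct.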

3D modeling, validation, and evaluation of physicochemical, allergenicity, and antigenicity properties
After the construction of our multi-epitope vaccine, the tertiary structure was built with trRosetta (Figure 2). The 3D model was then validated by a Ramachandran plot, which showed 91.5% of residues in favored regions, 6.8% in allowed regions, and 1.7% in outlier regions (Figure S.1A). In the quality check, the ERRAT score and Z-score were 82.963 and −4.49, respectively (Figure S.1B and C).
According to VERIFY3D, a protein model is considered to be of good quality if at least 80% of the residues have an averaged 3D-1D score ≥ 0.2. In our case, 100.00% of the residues scored ≥ 0.2 in the 3D/1D profile (Figure 3).
The multi-epitope vaccine construct has a molecular weight of 15768.58 Daltons, a theoretical pI of 9.51, and an instability index of 37.29, which classifies it as stable. The physicochemical properties of the designed multi-epitope construct are presented in Table 1.
Figure 1. Schematic representation of the multi-epitope vaccine construct. The multi-epitope vaccine sequence, consisting of 143 amino acid residues, was constructed with CTL and HTL epitopes, depicted in pink and orange boxes, respectively. These epitopes were assembled with AAY and GPGPG linkers. The EAAAK linker (purple) was used to link the adjuvant at the N-terminus.
Regarding solubility, the SOLpro server predicted that the construct would be soluble upon overexpression in E. coli (score 0.5821). The allergenicity and antigenicity of the final vaccine sequence, including the adjuvant, were predicted using the AlgPred 2.0 and VaxiJen v2.0 servers, with scores of −0.77446766 and 0.4578, respectively. Hence, the results suggested that the multi-epitope vaccine was non-allergenic and highly antigenic.
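Two physicochemical indices commonly reported for such constructs, the grand average of hydropathy (GRAVY; Kyte & Doolittle, 1982) and the aliphatic index (Ikai, 1980), follow closed-form definitions that are easy to reproduce. The sketch below assumes the standard definitions (mean Kyte-Doolittle hydropathy per residue, and A + 2.9·V + 3.9·(I + L) in mole percent). It is applied to one epitope from the text because the full construct sequence is not reproduced here, so it is not meant to recompute the values reported for the construct.

```python
# Kyte-Doolittle hydropathy values (Kyte & Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean hydropathy value per residue."""
    return sum(KD[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# One HTL epitope from the text, used only as example input.
example = "IIKKFKKDLAAMLRI"
print(round(gravy(example), 3), round(aliphatic_index(example), 2))
```

A negative GRAVY value indicates a hydrophilic sequence on average, and a higher aliphatic index is usually read as a sign of greater thermostability.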

IFN-γ inducing epitopes
The IFNepitope server allowed us to predict the IFN-γ inducing ability of the antigenic regions of the multi-epitope vaccine sequence (Table 2).
According to the results, the numbers of hydrogen bonds (H-bonds) in the best-docked complexes were 3 (vaccine construct/TLR2) and 5 (vaccine/TLR4). The vaccine also formed several other interactions in each complex (number of interacting interface residues, interface area, number of salt bridges, and non-bonded contacts), as shown in Table 3.
To confirm the stability of the vaccine candidate bound to the TLR2/TLR4 receptors and to analyze overall mobility, the molecular dynamics simulation was performed using the iMODS server. This web server depicts domain mobility with two colored affine arrows and black arrows, which indicated that the two molecules in each vaccine-TLR2/TLR4 complex were directed towards each other (Figures 5A and 6A). The deformability of both complexes was then examined based on the location of hinges, which are regions of typically high deformability. The vaccine-TLR4 complex showed lower deformability than the vaccine-TLR2 complex (Figures 5B and 6B). The B-factors, calculated by NMA, are shown in Figures 5C and 6C and indicate a stable structure of the docked molecules. The server calculated eigenvalues of 1.523596e-05 and 1.879035e-05 for the vaccine-TLR2 and vaccine-TLR4 complexes, respectively (Figures 5D and 6D). The variance plot shows individual variances as red bars and cumulative variances as green bars (Figures 5E and 6E); the eigenvalue and the variance are inversely related. The residues of the vaccine-TLR2/TLR4 complexes were assessed with the covariance matrix, in which colors represent the type of correlation (red: correlated motions; blue: anti-correlated motions; white: uncorrelated motions) (Figures 5F and 6F). Finally, Figures 5G and 6G present the elastic network model, in which each spring connects the corresponding pair of atoms. The dark gray springs indicate the rigidity of the complexes.
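The inverse relation between eigenvalue and variance can be made concrete with a small calculation. The sketch below assumes that the variance of each normal mode is proportional to the reciprocal of its eigenvalue and normalizes over the modes supplied. The five eigenvalues are hypothetical and are not iMODS output for either complex.

```python
def variance_profile(eigenvalues):
    """Return (individual %, cumulative %) variance for each normal mode.

    Assumes each mode's variance is proportional to 1 / eigenvalue and
    normalizes over the modes that are passed in.
    """
    inverse = [1.0 / ev for ev in eigenvalues]
    total = sum(inverse)
    individual = [100.0 * v / total for v in inverse]
    cumulative = []
    running = 0.0
    for pct in individual:
        running += pct
        cumulative.append(running)
    return individual, cumulative


# Hypothetical eigenvalues for the first five modes (illustrative only).
eigs = [1.0e-5, 2.0e-5, 4.0e-5, 5.0e-5, 1.0e-4]
individual, cumulative = variance_profile(eigs)
for mode, (ind, cum) in enumerate(zip(individual, cumulative), start=1):
    print(f"mode {mode}: {ind:5.1f}% individual, {cum:5.1f}% cumulative")
```

Modes with the smallest eigenvalues require the least energy to deform the structure and therefore account for the largest share of the motion, which is why the first bars of the variance plot are the tallest.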

Immune simulation
To investigate the immune responses elicited by the final vaccine construct, an immune simulation was run on the C-ImmSim server. The primary response was marked by a significant increase in IgM. The secondary and tertiary responses were significantly higher than the primary response and were characterized by a decreased antigen concentration together with high levels of IgG1, IgG1 + IgG2, IgM, and IgM + IgG, as shown in Figure 7A. The results also showed a high B-cell population during the secondary and tertiary responses (Figure 7B). High levels of plasma B cells were likewise recorded in the secondary and tertiary responses, producing a marked increase in IgM and IgG1, which reflects active B-cell proliferation and the presence of memory B cells (Figure 7C). Furthermore, TH memory cells increased, with a higher level of active TH cells during the secondary and tertiary responses (Figure 7D and E). The TC cell population rose to a maximum of more than 1150 cells per mm³ and then fluctuated (Figure 7F). Figure 7G presents the fluctuation of the NK cell population over 365 days; the average level was 350 cells per mm³. The fluctuation of the macrophage population is shown in Figure 7H, indicating a significant level of active macrophages after the first doses. Finally, IFN-γ and IL-2 reached high levels in the first days after the introduction of the vaccine (Figure 7I).

Codon optimization and in silico cloning
The cDNA sequence generated after codon optimization was 429 nucleotides long. Our vaccine sequence had an average GC content of 52.21%, which lies within the optimal range (30-70%), and a CAI value of 1.0, indicating good protein expression in E. coli. Finally, HindIII and BamHI restriction sites were added at the beginning and end of the sequence (Figure 8).
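The cloning step can be illustrated with a short check that a coding sequence is a whole number of codons and does not already contain the restriction sites that will flank it. The recognition sequences AAGCTT (HindIII) and GGATCC (BamHI) are standard. The function name, the stand-in coding sequence, and the choice of which site goes at which end are assumptions: the text states only that the two sites were added at the beginning and end, and the optimized cDNA is not reproduced here.

```python
HINDIII = "AAGCTT"  # HindIII recognition site
BAMHI = "GGATCC"    # BamHI recognition site


def add_cloning_sites(cds, five_prime_site, three_prime_site):
    """Flank a coding sequence with restriction sites after basic checks."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    for site in (five_prime_site, three_prime_site):
        if site in cds:
            raise ValueError(f"internal {site} site would be cut during cloning")
    return five_prime_site + cds + three_prime_site


# 143 residues x 3 nucleotides per codon gives the reported 429 nt.
expected_length = 143 * 3

# Short stand-in sequence (hypothetical), six codons long.
toy_cds = "ATGGTTCTGGCTATCCTG"
insert = add_cloning_sites(toy_cds, HINDIII, BAMHI)
print(expected_length)
print(insert)
```

This sketch does not check the reading frame relative to the vector or add start and stop codons, both of which would matter for a real expression construct.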

Discussion
ZIKV outbreaks represent a pandemic threat to global public health (Song et al., 2017). The virus can be transmitted in several ways (Plourde & Bloch, 2016), and vaccination would be the best way to prevent ZIKV infection. Unfortunately, no vaccine is yet available for the prevention or treatment of the virus (Ezzemani et al., 2021a). Reverse vaccinology, which takes advantage of the pathogen's genome sequence, has been widely used to identify epitopes for developing multi-epitope vaccines (Rappuoli, 2000). In our previous paper (Ezzemani et al., 2021a), the B-cell and T-cell epitopes of the Capsid, Envelope, NS5 RdRp, NS3 Protease, and NS1 proteins were predicted using several immunoinformatics methods. Several studies have shown that T-cell responses are well suited to vaccination against infection. Indeed, vaccines based on T-cell epitopes have been explored in recent years because of their ability to target conserved epitopes and hence provide long-term protection against different virus strains (Dawood et al., 2019; Liu et al., 2011). In this study, a multi-epitope vaccine candidate was constructed using T-cell epitopes (CTL and HTL epitopes) from three proteins (capsid, envelope, and NS1), selected for their high affinity and immunogenicity and shown to be non-toxic. These proteins play a crucial role in the replication cycle and the particle structure of the virus (Ezzemani et al., 2021a). The construct also includes the adjuvant β-defensin, which has a strong immunostimulatory potential to effectively activate both innate and adaptive immune responses (Mohan et al., 2013). These elements were fused using the EAAAK, AAY, and GPGPG linkers to design the vaccine construct. Linkers play a dual role: they increase the efficiency of epitope presentation and provide the necessary separation between epitopes (Arai et al., 2001). The multi-epitope vaccine candidate was then subjected to bioinformatics evaluations.
The primary amino acid sequence of the multi-epitope vaccine was used to model its 3D structure, which was validated with the Ramachandran plot to determine the stereochemical quality of the protein model (Gopalakrishnan et al., 2007); the high quality of the 3D structure was then confirmed by the ERRAT score and Z-score. The selected model also appears to be of good quality according to the 3D-1D score estimated by VERIFY3D. In addition, the protein vaccine was predicted to be antigenic, soluble, and non-allergenic. The molecular weight of the vaccine was 15768.58 Da and its theoretical pI was calculated to be 9.51. The instability index was 37.29, highlighting the stability of the vaccine. The aliphatic index of 77.20 indicated that the protein is thermostable (Ikai, 1980). The GRAVY index of the vaccine was −0.271, suggesting high solubility (the lower the GRAVY score, the better the solubility) (Kyte & Doolittle, 1982). Moreover, the presence of IFN-γ inducing epitopes suggests that the vaccine has the capacity to trigger strong immune responses. The designed vaccine was then docked with TLR2 and TLR4, indicating the possibility of an innate immune response with a crucial role in maintaining the balance of TH1 and TH2 responses (Mukherjee et al., 2016). The molecular docking analysis identified a high affinity between the multi-epitope vaccine candidate and the TLR2/TLR4 receptors; a larger cluster size and lower interaction energy scores indicate an effective protein interaction. The binding stability of both complexes was assessed using the iMODS server, which indicated stable binding and good flexibility (López-Blanco et al., 2014). Subsequently, the immune response simulation was conducted using the C-ImmSim server to understand the cellular and humoral responses.
The immune simulation showed an increase in IgM in the first days after vaccination, which is a normal immune response to vaccination (Kazi et al., 2018; Toman et al., 2019). A similar approach was used by Negahdaripour et al., who designed a multi-epitope vaccine against human papillomavirus (HPV) (Negahdaripour et al., 2017). Several studies have used similar multi-epitope vaccine design strategies against MERS (Srivastava et al., 2018), Ebola (Bazhan et al., 2019), and SARS-CoV-2 (Bhatnager et al., 2021). Thus, a multi-epitope vaccine carefully constructed with such a methodology could become a valuable tool against tumors and viral infections (L. Zhang, 2018). In addition, in silico cloning into the pET-28a(+) vector was performed to evaluate the expression and potency of the vaccine. The vector was chosen for its suitability for heterologous cloning and expression in E. coli. Finally, the designed vaccine requires in vitro and in vivo experimental analyses to determine its safety and efficacy in preventing ZIKV infections.

Conclusion
Immunoinformatics leverages computational techniques to deliver practical advantages in the search for new vaccines. Scanning the proteins encoded by a pathogen's genome to identify immunogenic epitopes makes it possible to elicit an immune response without any risk of reversion to pathogenicity. Therefore, our in silico study used this approach to identify a vaccine candidate against ZIKV infection.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This study was supported by Institut Pasteur du Maroc.