Production and Characterization of Two Specific ZIKV Antigens Based on Bioinformatic Analysis and Serological Screening

ABSTRACT
Background: The high structural similarity between the Zika virus (ZIKV) and other flaviviruses, such as Dengue Virus (DENV), complicates the identification of the infecting virus due to the occurrence of cross-reactions in serological assays. This phenomenon has increased the demand for more specific antigens for immunodiagnostic applications.
Methods: The present work aimed to identify specific regions of ZIKV and produce unique antigens through computational methods, molecular and microbiological techniques.
Results: Based on the computational analysis we successfully expressed two recombinant proteins derived from specific regions of the ZIKV. Through serological assays using characterized sera, we observed that the region 146–182 of ZIKV’s E protein, expressed in tandem, was not reactive despite the predictive sensitivity and specificity observed by computer analyses. On the other hand, the non-denatured fraction 220–352 of ZIKV’s NS1 showed greater specificity to IgG+ sera of ZIKV by dot blot and western blot, which highlights its properties as a possible tool in the diagnosis of ZIKV.
Conclusion: These findings demonstrate that ZIKV NS1 fraction 220–352 is a potential tool that may be applied in the development of serological diagnosis. We also provided data that suggest the non-applicability of the region 146–182 of ZIKV’s protein E in serological assays despite previous indications about its potential based on computational analysis.


Introduction
The emergence of the Zika virus (ZIKV) in the Americas has raised public health concerns since its first detection in Brazil in 2015 (Campos et al. 2015), mainly due to microcephaly in newborns infected by vertical transmission (Mlakar et al. 2016; Schuler-Faccini et al. 2019; Zhang et al. 2014).
Because arboviruses such as ZIKV, dengue virus (DENV), and chikungunya virus (CHIKV) cause similar symptoms, laboratory tests are necessary to confirm the clinical diagnosis. Molecular methods such as RT-PCR can detect ZIKV in fluids such as blood/serum, urine, semen, and cerebrospinal fluid (Agarwal and Chaurasia 2021; Rawal et al. 2016). Immunodiagnostic assays, on the other hand, face the issue of cross-reactions between flaviviruses: the high similarity between flavivirus proteins makes the viruses difficult to distinguish and leads to false-positive results (Lee et al. 2017; Wen and Shresta 2019). During a flavivirus infection, the host can also mount a memory immune response against a flavivirus from a previous infection through the recognition of shared epitopes, which exacerbates the issue of cross-reactions (Rodriguez-Barraquer et al. 2019; Rogers et al. 2017).
Therefore, the cross-reactions that occur in the immunodiagnosis of ZIKV increase the demand for more specific immunodiagnostic methods. Among the immunogenic proteins of ZIKV, the E protein and non-structural protein 1 (NS1) have a prominent role. In a previous study, the antigenicity of the E protein epitopes was mapped, highlighting the sensitivity of domain I (DI) of the E protein (Akhras et al. 2019). Other studies present NS1 as a soluble protein that is secreted in large amounts into blood plasma (Scaturro et al. 2015) and is responsible for triggering a greater humoral response. In a study of resistance to ZIKV infection in the Thai population, reactivity to the NS1 protein correlated with the presence of neutralizing antibodies in sera (Sornjai et al. 2019), which highlights the antigenic potential of NS1.
Thus, in this study, we sought to produce and characterize two recombinant proteins derived from specific regions of ZIKV, based on projections from bioinformatics analysis, to assess their potential use as a tool for differential detection of ZIKV in serological assays.

Conservation and antigenicity prediction analysis
To identify ZIKV-specific peptides, multiple sequence alignments (MSA) were performed using amino acid sequences of the E and NS1 proteins of ZIKV, yellow fever virus (YFV), Ilheus virus (ILHV), Saint Louis encephalitis virus (SLEV), Japanese encephalitis virus (JEV), West Nile virus (WNV), and the four dengue virus serotypes (DENV1-4). The MSA were performed using the MUSCLE tool in MEGA X software and were visualized in the GENEDOC software to highlight the non-conserved regions among the flaviviruses. The sequences used in the alignments are described in Supplementary Table S1. From the highlighted regions, ZIKV peptides were selected for identity analysis by BLASTp, from the National Center for Biotechnology Information (NCBI). The same regions were also evaluated for predicted epitopes using the linear epitope prediction and Emini surface accessibility prediction methods of two servers: the B-cell epitope prediction server (BCPREDS) (Chen et al. 2007; El-Manzalawy et al. 2008) and the scale-based B-cell epitope prediction tools of the IEDB Analysis Resource (Ponomarenko and Bourne 2007).
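The idea of highlighting non-conserved regions in an alignment can be illustrated with a minimal sketch. It is not the MEGA X/GENEDOC workflow used here: the scoring rule (fraction of sequences sharing the most common residue per column), the threshold, the minimum run length, and the function names are all assumptions chosen for illustration, and the toy sequences are made up rather than taken from any flavivirus.

```python
from collections import Counter


def column_conservation(aligned):
    """Fraction of sequences sharing the most common symbol in each column.

    Gap characters are counted like any other symbol (a simplification).
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*aligned):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


def low_conservation_regions(scores, threshold=0.5, min_length=3):
    """Return 1-based inclusive (start, end) runs with every score <= threshold."""
    regions = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score <= threshold:
            if start is None:
                start = position
        else:
            if start is not None and position - start >= min_length:
                regions.append((start, position - 1))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(scores) - start + 1 >= min_length:
        regions.append((start, len(scores)))
    return regions


# Hypothetical toy alignment (NOT real viral sequences).
toy_alignment = [
    "MKTAYLQRSG",
    "MKTGHVNRSG",
    "MKTCWIDRSG",
    "MKTDFPERSG",
]
scores = column_conservation(toy_alignment)
print(low_conservation_regions(scores))
```

In this toy case, columns 4 to 7 differ in every sequence and are reported as a single low-conservation run, which mirrors how a candidate virus-specific region would stand out in a real alignment.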

Genetic constructs, solubility prediction, transformation, and expression
Two genetic constructs were designed for the expression of recombinant proteins: (i) the first construct is a sequence of five tandem (uninterrupted) repeats of the coding region for residues 146-182 of the ZIKV E protein (Tan_E), and (ii) the second construct expresses the region 172-352 of the ZIKV NS1 protein (partial NS1, pNS1). Both coding regions were optimized for expression in E. coli, commercially synthesized (GenScript®), and cloned into pET-28a using the BamHI and NdeI restriction sites.
The gene sequences used to express Tan_E and pNS1 in pET-28a and their respective amino acid sequences are shown in Supplementary Figure S1A. The solubility of the recombinant proteins was estimated using the Protein-sol tool (https://protein-sol.manchester.ac.uk/) (Protein-sol, University of Manchester) (Hebditch et al. 2017), as shown in Supplementary Figure S1B. The pET-28a plasmids were used to transform competent E. coli strains BL21(DE3) Star™ (Promega®) and C41 (Sigma-Aldrich) by heat shock. Expression was induced under optimized conditions by adding 0.5-1 mM isopropyl β-d-1-thiogalactopyranoside (IPTG) to 250 mL cultures (in 1 L Erlenmeyer flasks) growing at 37 °C under agitation, at an optical density (OD) between 0.7 and 0.9. After 3 hours of culture, the bacteria were centrifuged at 4000 rpm for 20 min and resuspended in 30 mL of phosphate buffered saline (PBS). The expression of the recombinant proteins was confirmed by SDS-polyacrylamide gel electrophoresis (SDS-PAGE).
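The tandem-repeat design described for Tan_E can be sketched in a few lines. The actual peptide and construct sequences are given only in Supplementary Figure S1A, so the sketch below uses a placeholder unit of the right length; the function names are hypothetical, and vector-encoded residues such as the histidine tag are deliberately not modeled.

```python
def region_length(start, end):
    """Number of residues in a 1-based inclusive region such as 146-182."""
    if end < start:
        raise ValueError("end must be >= start")
    return end - start + 1


def tandem_repeat(unit, copies):
    """Concatenate `copies` uninterrupted repeats of a peptide unit."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return unit * copies


# Placeholder unit: 'X' stands in for the real residues of region 146-182,
# which are not reproduced in the main text.
unit_length = region_length(146, 182)
placeholder_unit = "X" * unit_length
tan_e_like = tandem_repeat(placeholder_unit, 5)

print(unit_length, len(tan_e_like))
```

Under these assumptions the repeated unit spans 37 residues and five copies give a 185-residue insert, before any residues contributed by the expression vector.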

Solubility test and purification of recombinant proteins
To evaluate the solubility of the recombinant proteins, induced cells were lysed in PBS pH 7.4 by sonication at 50 Hz for 1 minute, with 10-second intervals on ice. The lysates were fractionated by centrifugation at 6000 rpm for 20 min at 8 °C; the pellet contained the insoluble proteins and the supernatant the soluble proteins, and both fractions were subsequently analyzed by SDS-PAGE. To solubilize the insoluble proteins, the pellet was resuspended in AKTA wash buffer with urea (20 mM NaH2PO4; 0.5 M NaCl; 3.5 mM imidazole; 8 M urea; pH 8.0) and centrifuged at 6000 rpm for 60 min, after which the supernatant was collected.
Dialysis processes were also carried out to remove imidazole and urea. The dialysis of soluble proteins was performed in 1 L dialysis buffer (Tris 100 mM; NaCl 300 mM), followed by partial exchanges of 200 mL of dialysis buffer (every hour) until 5 L of buffer was used; while the dialysis of insoluble proteins was performed in 1 L dialysis buffer with urea (100 mM Tris; 300 mM NaCl; 700 mM Urea), followed by partial exchanges of 200 mL of dialysis buffer (every hour) until use of 5 L of buffer (100 mM Tris; 300 mM NaCl).

SDS-PAGE analysis
Samples were mixed with Laemmli reagent (100 mM Tris HCl pH 6.8, 4% SDS, 20% glycerol, 0.2% bromophenol blue) and denatured in boiling water for 8 minutes. Electrophoresis was conducted at 40 V through the stacking gel (4%) and 100 V through the separating gel (12%), using the electrophoresis buffer (25 mM Tris base, 192 mM glycine, 3.5 mM sodium dodecyl sulfate (SDS)). At the end of the run, the proteins were fixed and stained in the gel with a dye solution (50% methanol, 10% acetic acid, 0.1 g Coomassie blue G-250, for 100 mL) in a water bath at 55 °C for 30 minutes, followed by consecutive washes in distilled water in a water bath at 55 °C for 15 minutes. Molecular weights were estimated by comparison with the molecular weight marker Precision Plus Protein™ Dual Color Standards (Bio-Rad). Alternatively, non-denaturing PAGE followed the same methodology, omitting SDS and thermal denaturation.

Sera samples used for reactivity characterization
Serological assays were performed using sera characterized as containing IgG antibodies reactive to ZIKV, IgG antibodies reactive to DENV, or as non-reactive. The samples came from the serological collection of the Virology Laboratory of UFBA and Hospital Aliança, Salvador-BA. Sera collected during 2015-2016 were first selected for positivity for ZIKV and negativity for DENV by RT-qPCR (to ensure that the immune response in those sera derived from exposure to ZIKV through natural infection) and for a positive ZIKV IgG result by indirect ELISA (Zika virus IgG ELISA, Euroimmun, cat. EI 2668 G), performed according to the manufacturer's recommendations. Sera positive for IgG by ELISA using Serion ELISA classic Dengue Virus IgG and IgM (cat. ESR 114 G, cat. ESR 114 M), collected between 2009 and 2011 (before the first outbreak of ZIKV in Brazil), were also used. Sera with negative results in both the Zika and Dengue ELISA tests were used as negative controls.

Dot blot analytics
The dot blot technique was performed using the Hybond-C Extra nitrocellulose membrane, GE Healthcare Life Sciences, 0.45 μm (cat. RPN203E). Proteins were quantified by spectrophotometry (absorbance at 280 nm) against a BSA standard and diluted to 0.3 μg/μL. The nitrocellulose membrane was equilibrated in transfer buffer and placed in a microwell vacuum apparatus for transfer. Each well was loaded with 50 µL (15 µg) of protein. After the transfer, the membrane was dried in an oven at 37 °C for 15 min and blocked in PBS-5% milk (Skim Milk Powder, SIGMA-ALDRICH, cat. 70156-500 G) for 1 h and 20 min, with stirring. The membranes were then incubated with serum diluted 1:100 in PBS-0.5% milk for 90 minutes at 37 °C, washed three times with PBS-Tween 0.05% for 10 minutes under agitation, and incubated with peroxidase-conjugated anti-human IgG antibody (Zymax, Invitrogen), diluted 1:500 in PBS-0.5% milk, for 90 minutes at 37 °C. After three washes with PBS-Tween 0.05% for 10 minutes and one wash with 1× PBS, the membranes were incubated for 10 minutes in developing solution to reveal the staining. The reaction was stopped with successive washes in distilled water. The assays were conducted on separate groups of the previously mentioned sera. The numbers of samples in the anti-ZIKV, anti-DENV, and control sera groups were, respectively, 26, 20, and 14 for the Tan_E assays, and 12, 8, and 9 for the pNS1 assays.
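The loading and dilution arithmetic in this protocol can be checked with a short sketch. The helper names are hypothetical, and the reading of a "1:100" dilution as one part stock in 100 parts total volume is an assumption, since the text does not state which convention was used; the 10 µL serum volume is likewise only an example.

```python
def protein_mass_ug(concentration_ug_per_ul, volume_ul):
    """Mass of protein delivered by a given volume of solution."""
    return concentration_ug_per_ul * volume_ul


def diluent_volume_ul(stock_volume_ul, dilution_factor):
    """Diluent needed for a 1:dilution_factor dilution.

    Assumes '1:N' means 1 part stock in N parts total volume.
    """
    if dilution_factor < 1:
        raise ValueError("dilution_factor must be >= 1")
    return stock_volume_ul * (dilution_factor - 1)


# Values from the protocol: 0.3 ug/uL protein, 50 uL per well.
mass_per_well = protein_mass_ug(0.3, 50)

# Example volume (an assumption): 10 uL of serum diluted 1:100.
serum_diluent = diluent_volume_ul(10, 100)

print(round(mass_per_well, 2), serum_diluent)
```

The per-well mass computed this way agrees with the 15 µg stated in the protocol.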

Colorimetric density quantification and statistical analysis
The dot blot stainings were quantified from high-resolution images, which were adjusted to brightness and contrast standards and converted to 8-bit grayscale using the ImageJ program. The software was configured to measure the density of the stains from their grayscale plots and to quantify the area of the plotted projections. Area values were evaluated for normal distribution by the Shapiro-Wilk test and by histogram plotting with normal curve overlay, using the PAST 4.0 program. Because the values were normally distributed and more than two groups were compared, the significance of the differences between groups was calculated by one-way ANOVA and Tukey's pairwise tests, with p < .05 considered significant.
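The one-way ANOVA step compares between-group and within-group variance of the area values. The sketch below computes only the F statistic and its degrees of freedom in plain Python; it is not the PAST 4.0 implementation, it does not compute p-values or Tukey's pairwise comparisons (which would need an F distribution and a studentized range distribution from a statistics package), and the numbers are made up rather than taken from this study.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more values than groups")
    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical area values for three serum groups (illustrative only).
toy_groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(f_stat, df_b, df_w)
```

A large F relative to the F distribution with the reported degrees of freedom indicates that at least one group mean differs, after which pairwise tests such as Tukey's identify which groups differ.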

Prospecting for ZIKV-specific antigens
To identify ZIKV-specific peptides, we performed bioinformatic analysis to characterize important aspects such as amino acid sequence specificity, antigenicity based on epitope prediction, and solubility predictions of recombinant proteins derived from ZIKV-specific regions.
The homology analysis highlighted low conservation among flaviviruses in five small regions of the E protein (81-97, 146-182, 226-241, 273-286, and 364-374) and in a wide region of NS1 (172-352). Among the low-conservation regions of the ZIKV E protein, the peptide 146-182 (pt146-182E) stood out as the largest (Supplementary Figure S2A). In the identity analysis, the peptides pt146-182E, pt273-286E, and pt364-374E did not show significant similarity to DENV, SLEV, YFV, or WNV (Supplementary Table S1) under the BLASTp default parameters. When different strains of ZIKV were compared, pt146-182E showed high conservation among them, with only a single point mutation relative to the African strain of ZIKV (Supplementary Figure S2B). As an alternative to the strategy of using small peptides, a wider region was also analyzed by aligning the NS1 amino acid sequences of the flaviviruses, which include the peptide fraction of region 172-352 of the ZIKV NS1 (pt172-352NS1). pt172-352NS1 showed homology to the equivalent regions of different flaviviruses (Supplementary Figure S2C). In the identity analysis, the pt172-352NS1 sequence showed complete identity to ZIKV, despite being relatively close to other flaviviruses such as Spondweni virus (SPOV) (76% identity), DENV3 (62.98% identity), and DENV1 (62.43% identity) (Supplementary Table S1). Regarding conservation, pt172-352NS1 showed high homology with sequences from different strains of ZIKV, presenting only three point mutations (Supplementary Figure S2D).
To verify the antigenicity of the previously mentioned regions, the complete amino acid sequences of the E and NS1 proteins were analyzed by linear epitope prediction, using the Linear Epitope Prediction method, and by epitope accessibility analysis, using Emini Surface Accessibility Prediction. All five of the previously highlighted peptides derived from the ZIKV E protein were predicted to be antigenic, with pt146-182E the most antigenic. The same analysis showed that the ZIKV NS1 fraction 172-352 has greater predicted antigenicity than the whole protein (Supplementary Figure S3).
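Identity percentages of the kind reported above can be understood through a simplified calculation on two pre-aligned sequences. This sketch is not how BLASTp computes identity (BLASTp reports identity over its own local alignment), the function name is hypothetical, and the two toy sequences are invented for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared columns with identical residues.

    Columns where both sequences have a gap are skipped; a gap against a
    residue counts as a mismatch (one common convention among several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned pair (NOT real viral sequences).
identity = percent_identity("MKT-AYLQ", "MKTGAHLQ")
print(identity)
```

Different gap-handling conventions change the denominator, which is one reason identity values from different tools are not always directly comparable.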

Expression and purification of Tan_E and pNS1
Based on the bioinformatics analysis, we selected the pt146-182E and pt172-352NS1 regions for recombinant peptide expression in prokaryotes. Two genetic constructs were designed for recombinant protein expression using different strategies. The first recombinant protein came from an antigenicity amplification strategy, in which the peptide pt146-182E was expressed as five tandem repeats (Tan_E). The second strategy consisted of the partial expression of NS1 from ZIKV (pNS1), in which the 172-352 fragment was expressed without repeats.
The coding regions of the recombinant proteins Tan_E and pNS1 were cloned into pET-28a for subsequent transformation of expression bacteria. In induction tests, Tan_E production was best after 3 h of incubation with 0.5 mM IPTG, starting from an initial OD between 0.6 and 0.8. SDS-polyacrylamide gel electrophoresis (SDS-PAGE) showed the expression of Tan_E with a molecular weight of 34 kDa. Leaky expression was also observed: even without induction, the transformed bacteria expressed the recombinant gene (Figure 1a). We also standardized the expression of pNS1 in C41 cells induced for 3 hours with 1 mM IPTG from an OD between 0.6 and 0.8; SDS-PAGE of the induced bacteria showed a band of 25 kDa (Figure 1b).
To test the solubility of the recombinant proteins, the induced bacteria were lysed, resuspended in phosphate buffer, and centrifuged to separate the soluble (supernatant) and insoluble (pellet) phases. SDS-PAGE of the fractions detected Tan_E (34 kDa) in the supernatant, indicating its soluble character (Figure 1c), while pNS1 (25 kDa) was present in the insoluble fraction, which characterizes it as an insoluble protein (Figure 1d). Treatment of the induced bacterial extract with 8 M urea resuspension buffer solubilized pNS1, which showed a soluble character in a new solubility test (Figure 1e). Both recombinant proteins, in the soluble fraction, were purified by histidine-tag affinity chromatography. After the elution washes, Tan_E was obtained at high purity from the third elution, while pNS1 was purified in the fourth elution step. The eluted material was then dialyzed, reaching a higher degree of purity, as seen on SDS-PAGE (Figure 1f,g).

Evaluation of the reactivity and specificity of non-denatured Tan_E and pNS1
The reactivity of non-denatured Tan_E and pNS1 proteins was evaluated through dot blot assays with sera previously characterized for the presence of anti-ZIKV and anti-DENV IgG by immunochromatography, followed by analysis of the colorimetric density of the stainings (Figure 2). In the assays where whole ZIKV was used as antigen, no significant difference was observed between the anti-ZIKV and anti-DENV IgG+ sera, although both differed from the negative control sera (Figure 2a). Likewise, when the Tan_E antigen was used in the dot blot assays, no difference was observed between any of the three serum groups (Figure 2b). On the other hand, in the dot blot assays using the pNS1 antigen, the staining density with anti-ZIKV IgG+ sera was higher than with anti-DENV IgG+ sera, which did not differ from the staining with negative control serum (Figure 2c).

Figure 2.
Tan_E and pNS1 reactivity by dot blot and colorimetric density analysis. (a) Dot blot assays using whole ZIKV (evidencing the occurrence of cross-reaction in ZIKV and DENV serology), Tan_E, and pNS1 for analysis of reactivity to anti-ZIKV IgG+ sera, anti-DENV IgG+ sera, and negative control sera (anti-ZIKV/DENV IgG-). (b) Analysis of the estimated colorimetric density of the dot blot labels using the full-length ZIKV, (c) the recombinant protein Tan_E, and (d) pNS1. The data showed normal distribution by the Shapiro-Wilk method and histogram plotting. Statistical significance was assessed by one-way ANOVA and Tukey's pairwise test, with p < .05 (*), p < .01 (**), or p < .001 (***).

Evaluation of the reactivity and specificity of denatured Tan_E and pNS1
To verify the reactivity of linear epitopes of the Tan_E and pNS1 antigens with anti-ZIKV and anti-DENV IgG+ sera, both antigens were subjected to denaturing treatment and western blot assay.
The western blot assay using purified and dialyzed Tan_E did not show immunological staining in any of the tested sera groups, apart from a nonspecific stain with one of the control sera (Figure 3a). Among the anti-ZIKV IgG+ sera, the three that presented the highest colorimetric density values in the dot blot assays were selected for re-analysis by western blot, comparing the protein profiles of purified Tan_E, of bacteria induced to produce Tan_E, and of control (non-transformed) bacteria. Again, no immunological reactivity with Tan_E was observed in the three tested sera (Figure 3b).
The reactivity of linearized pNS1 was also evaluated by western blot. In the denaturing immunoassay, linearized pNS1 could not be detected in any of the groups of sera tested (Figure 3c). However, the same assay using the conformational (non-denatured) pNS1 revealed immunological stains corresponding to pNS1 in the seven anti-ZIKV IgG+ sera, but also in two of the seven anti-DENV IgG+ sera, while none of the negative control sera were reactive (Figure 3d).

Figure 3.
Tan_E and pNS1 reactivity in western blot assays. (a) Western blot of purified Tan_E, using sera reactive for ZIKV, DENV, and negative control (non-reactive sera). (b) Western blot of Tan_E protein and controls (induced and non-induced bacteria), using the most reactive sera from the previous dot blot assay on Tan_E. (c) Western blot using linear (denatured) pNS1 with sera reactive for ZIKV, DENV, and non-reactive, highlighting that pNS1 could not be detected. (d) Western blot using conformational (non-denatured) pNS1 with sera reactive for ZIKV, DENV, and non-reactive, evidencing the detectability of pNS1 by ZIKV-reactive sera.

Discussion
The bioinformatic analysis that we performed identified regions of the ZIKV E and NS1 proteins whose amino acid sequences have low identity to those of other flaviviruses. Among these regions, a fraction of domain I (DI) of the E protein, at position 146-182, showed predicted antigenicity and specificity. These findings agree with other computational studies that applied different methodologies and tools and that also highlighted the 146-182 region of the ZIKV E protein as of interest because of its immunogenicity and sequence uniqueness to ZIKV (Amrun et al. 2019; Fumagalli et al. 2021; Lee et al. 2017), which reinforced the need for experimental investigation. In animal hyperimmunization assays using the DI domain of the ZIKV E protein, the affinity of IgG antibodies for the 146-182 region was demonstrated, and the antibodies produced did not cross-react with other flaviviruses (Akhras et al. 2019). Given these projections about the potential use of the 146-182 region of the ZIKV E protein as a tool for developing immunodiagnostic assays, the present work helped to elucidate its practical applicability by assessing its serological reactivity through the expression and characterization of an antigen with five sequential repeats, designed to amplify epitope availability.
The dot blot analysis of Tan_E allowed inferences about the reactivity of the 146-182 region of the E protein and about cross-reactions between flaviviruses. First, cross-reaction between flaviviruses was evidenced in the dot blot assay with whole ZIKV, where no difference was observed between the Zika and Dengue sera. Second, non-denatured Tan_E was not reactive in the dot blot assays: the measured values did not differ between the tested serum groups, including the negative control group. To confirm these results, we also performed a western blot assay, in which the recombinant protein could not be detected using the immunoreactive sera. The results of the experiments with Tan_E suggest that the 146-182 region of the ZIKV E protein is not recognized by IgG from immunoreactive human sera derived from natural infection, which contrasts with previously performed computational surveys (Amrun et al. 2019; Fumagalli et al. 2021; Lee et al. 2017). Because it is a small region, the high specificity sought may have come at the expense of immunological sensitivity, just as choosing a large fragment could increase sensitivity at the cost of specificity (Mcnamara and Martin 2018).
Our computational analyses also pointed to regions of the NS1 protein that are unique to ZIKV, among which we highlight the meso-terminal fraction of the protein for experimental evaluation of antibody reactivity. Other bioinformatic analyses likewise presented the meso-terminal fraction of the ZIKV NS1 as a peptide of potential application in immunoassays because of its predicted sensitivity and specificity (Amrun et al. 2019; Lee et al. 2017). The proposed use of pNS1 could increase specificity compared to the whole NS1. Relative to the other ZIKV proteins, NS1 has greater specificity and has been used experimentally for the differential diagnosis of ZIKV infection (Sornjai et al. 2019). Even so, immunological cross-reactivity has also been described for the ZIKV NS1 protein, and genetic modifications have already been made to increase its specificity (Yap et al. 2021). BLASTp identity analysis also highlighted the high amino acid sequence identity of the pNS1 antigen with DENV. However, despite this similarity, NS1 is less cross-reactive than other ZIKV proteins (Sornjai et al. 2019), an observation that could also be reflected in differential immunodiagnosis using the pNS1 that we present. In a computational study in which linear and conformational epitopes of flaviviruses were analyzed, the ZIKV NS1 fraction 172-352 stood out as having a large number of ZIKV-specific regions, despite the presence of two regions in common with DENV for recognition by DENV IgG (Fumagalli et al. 2021). Our proposal for the expression of pNS1 could therefore guide the development of sensitive tests with higher specificity than those that use the whole NS1 protein as an antigen for the detection of ZIKV.
Despite the amino acid sequence similarity (Supplementary Table S1), NS1 has only a limited number of epitopes in common with DENV (Amrun et al. 2019; Lee et al. 2017), which was reflected in the ability to distinguish the two viruses in the experimental tests. In the pNS1 expression assays, the protein was insoluble in PBS pH 7.4, which differs from the computational projection of a soluble protein with a predicted solubility of 52.9% (soluble when >45%). The use of 8 M urea for resuspension therefore enabled the purification of pNS1 by histidine-tag affinity chromatography, in line with similar methodologies reported in the literature (Bush et al. 1991; Hochuli 1990; Patra et al. 2000).
In the dot blot assays, pNS1 showed greater reactivity to anti-ZIKV IgG+ sera, which suggests greater specificity for anti-ZIKV sera. Despite the statistical difference observed in the dot blot assays, we still observed staining with anti-DENV IgG+ sera, both in dot blot and western blot, which could be attributed to epitopes common to both pathogens (Fumagalli et al. 2021). The greater specificity of pNS1 to anti-ZIKV IgG+ sera highlights its potential use as a tool for immunodiagnostics in Zika serology, circumventing the use of the entire NS1, which has been described as a cross-reactive protein in the serology of other flaviviruses (Felix et al. 2017; Muller and Young 2013). In a previous study, genetic engineering was applied to the NS1 protein in order to increase its specificity to ZIKV in immunodiagnosis (Yap et al. 2021).

Conclusion
In general, our study used computational analysis to highlight regions with potential applications in the development of ZIKV-specific peptides. From experimental assays, we demonstrated that the 146-182 region of the ZIKV E protein does not have sufficient sensitivity for application in serological diagnosis, despite favorable indications obtained through computational analyses. We also demonstrated that the partial expression of the ZIKV NS1, in its fraction 172-352, showed greater specificity to ZIKV in non-denaturing serological assays, which supports its potential use in serological diagnosis.