Conformational preferences of triantennary and tetraantennary hybrid N-glycans in aqueous solution: Insights from 20 μs long atomistic molecular dynamic simulations

Abstract In the current study, we have investigated the conformational dynamics of a triantennary (N-glycan1) and tetraantennary (N-glycan2) hybrid N-glycans found on the surface of the HIV glycoprotein using 20 μs long all-atom molecular dynamics (MD) simulations. The main objective of the present study is to elucidate the influence of adding a complex branch on the overall glycan structural dynamics. Our investigation suggests that the average RMSD value increases when a complex branch is added to N-glycan1. However, the RMSD distribution is relatively wider in the case of N-glycan1 compared to N-glycan2, which indicates that multiple complex branches restrict the conformational variability of glycans. A similar observation is obtained from the principal component analysis of both glycans. All the puckering states (4C1 to 1C4) of each monosaccharide except mannose are sampled in our simulations, although the 4C1 chair form is energetically more favorable than 1C4. In N-glycan1, the 1–6 linkage in the mannose branch [Man(9)-α(1-6)-Man(5)] stays in the gauche-gauche cluster, whereas it moves towards trans-gauche in N-glycan2. For both glycans, mannose branches are more flexible than the complex branches, and adding a complex branch does not influence the dynamics of the mannose branches. We have noticed that the end-to-end distance of the complex branch shortens by ∼ 10 Å in the presence of another complex branch. This suggests that in the presence of an additional complex branch, the other complex branch adopts a close folded structure. All these conformational changes involve the selective formation of inter-residue and water-mediated hydrogen-bond networks. Communicated by Ramaswamy H. Sarma Graphical Abstract


Introduction
Carbohydrate interactions are one of the most prevalent interactions at the cellular level (Werz et al., 2007). More than ten types of monosaccharides exist in our cells, forming various kinds of glycans depending on the variability of linkages and stereoisomers. The rigid sugar rings are attached via flexible glycosidic linkages that make the glycan molecules a multi number coexisting conformer in the solutions (Hardy, 1997;Woods, 2018). Along with this, several rotamers via those linkages and their potential acceptor and donor hydroxyl group for hydrogen bond increases glycan conformational complexity. All these glycan molecules are involved in several biological processes such as protein quality control (Helenius & Aebi, 2004;Lederkremer, 2009), protein trafficking (Guo et al., 2004;Shi & Elliott, 2004), protein stability (Chen et al., 2010;Culyba et al., 2011;Hanson et al., 2009;Sol a et al., 2007), immune response (Jefferis, 2009), glycanreceptor recognition (Gabius, 2000;Gabius et al., 2011;Wormald et al., 2002), as well as in several commercial roles like biocompatible and biodegradable materials (Slaney et al., 2011), design of human vaccine (Bernardi et al., 2013;Burton et al., 2012) and other pharmaceutical uses (Shukla & Tiwari, 2011;. Most of these oligosaccharides are covalently attached to the nitrogen atom of specific asparagine residue via N-glycosylation in the endoplasmic reticulum (ER) (Schwarz & Aebi, 2011). To examine the glycan recognition by viral pathogens (R. Raman et al., 2016;Shanker et al., 2017), or the recognition of viral glycoproteins by different lectin and antibodies specifically broadly neutralizing antibodies (bNAbs) (Krauss, 2016;Pierce et al., 2016;Shcherbakov et al., 2015) required a high level of structural knowledge of those glycan molecules, which is not clear in our current scenario. Therefore, the characterization of structural and conformational variability of different N-glycan sequences to address the virus-glycan interactions is the need of the hour.
Recently, the effect of fucosylation, sialyation, and galactosylation on the conformational dynamics of different N-glycans of IgG was studied using cMD (Harbison et al., 2019). Earlier, several complex and high-mannose branched biantennary or triantennary N-glycans were also investigated in an aqueous solution using molecular dynamics simulations (Guillot et al., 2016;Suzuki et al., 2017).
Herein, we have studied the conformational preferences of two hybrid N-glycan sequences (Figure 1(A, B) (Shivatare et al., 2018), with 20 ls long MD simulations. Although sequences are known from the previous work, the 3-dimensional structures have not been investigated yet. The glycan sequences (N-glycan1 and N-glycan2), which are used in our current study, have more than two branches ( Figure 1(A, B)).

MD simulation of glycans
Both the glycan (Figure 1 (A, B)) structures were modeled using the GLYCAM web server (www.glycam.org). N-glycan1 and N-gly-can2 consist of 10 and 13 monosaccharides, respectively. The linkage information was obtained from earlier studies by Shivatare and coworkers (Shivatare et al., 2018). The complete sequence of N-glycan1 and N-glycan2 are Neu5Aca2 -6Galb1 -4GlcNAcb1 - The energy minimized rotamer structures obtained from the GLYCAM webserver were used as the starting structure for MD simulations. MD simulations were carried out using the pmemd.cuda module of AMBER18 (Case et al., 2005). Glycans were modeled using the glycam06j-1 (Kirschner et al., 2008) forcefield. The glycan molecules were solvated in a truncated octahedral water box with a 12 Å buffer from the wall, with 3808 and 3504 TIP3P (Jorgensen et al., 1983) water molecules for N-gly-can1 and N-glycan2, respectively. Although N-glycan2 has three more residues than N-glycan1, a lesser number of water molecules were required to solvate N-glycan2 than N-glycan1. A similar trend was observed in earlier studies also. A suitable number of Na þ ions (1 Na þ for N-glycan1 and 2 Na þ for N-glycan2 ) were added to neutralize the solvated system. The overall system size is comparable for both systems. All bonds involving hydrogen atoms were constrained using the SHAKE algorithm (Ryckaert et al., 1977). The long-range electrostatic interactions were calculated using the particle-mesh Ewald (Darden et al., 1993) method. The non-bonded cut-off was set to 12 Å. The system temperature was kept fixed at 300 K using a Langevin thermostat (Allen & Tildesley, 2017) with a collision frequency of 2 ps À1 .
Each solvated system was minimized in two stages. In the first stage, all the solute atoms were kept fixed, and we conducted 500 steps of steepest descent followed by another 500 steps of the conjugate gradient to minimize the system. In the second stage, the minimization was performed without any restraint on the solute atoms. Subsequently, each system was heated to 300 K in a step-wise manner with a positional restraint of 2 kcal mol À1 Å À2 on the glycan chain in the canonical (NVT) ensemble. Next, under the NPT ensemble, 1.0 ns equilibration was carried out at 1 atm pressure using Berendsen barostat (Berendsen et al., 1984). The same barostat was used in earlier studies of glycans (Balogh et al., 2019;Fogarty et al., 2020). Finally, the unbiased production MD simulation was carried out for 20.0 ls in the NPT ensemble. A time step of 2.0 fs was used for the production run. The last 19.9 ls data, i.e. 1990000 frames for each glycan, were used for the analysis.

Analysis
Most of the analyses were accomplished using the Cpptraj module of AMBER18 (Roe & Cheatham, 2013). The glycan conformation was analyzed in terms of the local glycosidic dihedral angles, u, w, and x. These angles are defined as u ¼ O 5 -C 1 -O-C n , w ¼C 1 -O-C n -C (n-1), and x ¼ O 6 -C 6 -C 5 -C 4 . The generalized pseudo rotation coordinate formulated by Cremer and Pople (1975) was used for quantifying the puckering state of the six-membered pyranose ring. Following the IUPAC convention, the total puckering space was divided into 38 different regions. Further, we also constructed a 1-D free energy surface using the amplitude puckering parameter (h) as a reaction coordinate as used in previous studies (Alibay et al., 2018;Alibay & Bryce, 2019). The conformational free energy for different coordinates was constructed utilizing the following Boltzmann equation.
where, x,y ¼ reaction coordinates, P(x,y) ¼ population of specific bin of the system along reaction coordinate, P max (x,y) ¼ population of the most populated bin of the system along reaction coordinate, k ¼ Boltzmann constant and T ¼ absolute temperature.
To investigate the correlated and anti-correlated motions between heavy atoms of each carbohydrate residue, we generated the dynamic cross-correlation map (DCCM) using the cpptraj (Roe & Cheatham, 2013) module of Ambertools19. We also conducted principal component analysis (PCA) by diagonalizing the covariance matrix constructed by considering only the heavy atoms of carbohydrates that generate a set of eigenvectors and their corresponding eigenvalues representing the set of principal components (PCs). Prior to the calculation, the RMS fit was performed to remove any rotational and translational movement, and the first frame of the trajectory was used as a reference structure. The free energy landscape (FEL) was constructed using the first two or three PCs.
Global conformation was characterized with the help of different sets of end-to-end distances and angles. The occupancy of these parameters was calculated using the kernel density estimation procedure using in-house python scripts. All the plots were generated using in-house python scripts, and the 3-dimensional structures were visualized using ChimeraX (Goddard et al., 2018). We used the distplot function of the seaborn module of python for plotting purposes (https://seaborn.pydata.org/generated/seaborn.distplot.html). It uses the Gaussian kernel density estimation function to smoothen data.

Results and discussion
The result section is divided into local conformational changes in terms of glycosidic angles, sampling of each sixmember rigid ring structure, and the global structural changes by estimating various sets of distances and angles.

Stability and flexibility analysis
Firstly, we have computed the time evolution of root-meansquared deviations (RMSDs) of the ring atoms from the corresponding initial structure of the production run (i.e. the last frame of the equilibration run of 1 ns) and shown in Figure 1C. Since glycans are very flexible in solution, the RMSD distribution can be used to characterize general conformational variability (Jo et al., 2016). The distribution of RMSD of glycans for the entire trajectory is displayed in Supporting Information Figure S1(A). Also, to estimate the convergence of the simulation, the entire 20.0 ms trajectory is divided into four segments of equal length (5.0 ms). We have calculated the RMSD distribution for each segment and shown it in Supporting Information ( Figure S1(B-E)). We have also measured the average RMSD for each segment and listed it in Supporting Information Table S1. The average RMSD in successive parts ranges from 5.3 to 6.2 Å for N-gly-can1, while it stays around 6.2 Å for N-glycan2. This may indicate that N-glycan1 is more flexible than N-glycan2 in an aqueous solution. Further, it is evident from Table S1 that the average RMSD value in successive parts for both glycans lies within the standard deviation. It indicates the convergence of the individual simulation for each system.
In the case of N-glycan1, the peak is located at $ 5.2 Å along with a shoulder at $ 8.0 Å ( Figure S1A). On the other hand, in the case of N-glycan2, conformations with an RMSD of $ 6.0 Å are the most populated. Further, the RMSD distribution is relatively wider for N-glycan1 than N-glycan2, as shown in Figure S1(A). We have also estimated the RMSD value concerning the common core part of both systems with respect to N-glycan1 and shown in Supporting Information Figure S2. The same trend is observed. In this case, the average RMSD value of N-glycan2 is found to be 5.9 Å (0.2), which is similar to the average RMSD value (6.2 ± 0.3 Å) for the complete N-glycan2 sequence. The common part of both systems shows a similar deviation from the initial conformation.
Next, we have investigated the flexibility of each monosaccharide by measuring the root-mean-square fluctuation (RMSF), and shown in Supporting Information, Figure S3. For calculating the RMSF, the RMS fit of heavy atoms was performed to remove any global rotational, translational movement, and the last frame of the equilibration run of 1 ns was used as a reference structure. The terminal mannose and sialic acid display almost double fluctuations compared to other monosaccharides of the same glycan chain. The 4th and 5th mannose (see Figure 1(A, B)), i.e. the branching point of the complex and mannose branches, respectively, show an opposite pattern of RMSF value in N-glycan1 and N-glycan2. In N-glycan1, the 4th mannose is more flexible than the 5th mannose, while the 5th mannose is more flexible than the 4th mannose in N-glycan2.

Conformational propensity of different branches and linkages
Figures 2(A-I) and 3(A-L) depict the free energy maps corresponding to various glycosidic linkages (a, b) of N-glycan1 and N-glycan2. Overall, the free energy profile of the glycosidic linkages is similar for both glycans. Further, to measure the convergence of the simulation, the same was plotted for each 5.0 ms interval and shown in Supporting Information ( Figures S4-S11). The principal minimum for all the linkages stays almost in a similar position in each segment, indicating the convergence of the simulation for both systems. The principal minimum of each glycosidic linkage is reported in Table 1, which is estimated from the entire MD trajectory.
The common body of these two glycans consists of trisaccharides having GlcNAc(2)-b(1-4)-GlcNAc(1) and Man(3)-b(1-4)-GlcNAc(2) linkages. For both glycans, the free energy surfaces of both linkages are characterized by a single large minimum around (À80˚, 105˚) and (À75˚, 110˚) for GlcNAc(2)-b(1-4)-GlcNAc(1) (Figure 2(A, B)) and Man (3) (Figure 3(A, B)), respectively. It suggests that the initial common body maintains a rigid structure during the MD simulation. Similar observations were also found in the u angle distribution for the same linkage of other N-glycans of IgG Fc systems (Harbison et al., 2019), whereas our principal minima for w show opposite values. However, in both systems, these two linkages are comparatively more rigid than others, which is also evident in different human and plant glycans (Fogarty et al., 2020). Further, the hydrogen bond analysis reveals that the conformation corresponding to the principal minimum is guided by two strong hydrogen bonds formed among three saccharides. One of the hydrogen bonds is formed between O5@GlcNAc(2) and O3@GlcNAc (1), with more than 55% occupancy. The other hydrogen bond is formed between O5@Man(3) and O3@GlcNAc (2) with an occupancy of $ 35% (see Table 2).
N-glycan1 has one complex branch starting with the b(1-2) linkage between GlcNAc and Man, whereas N-Glyan2 has two branches having b(1-2) and b(1-4) linkages. The GlcNAc(6)-b(1-2)-Man(4) linkage, which is present in both glycan sequences, shows a broader minimum region in the N-glycan1 sequence compared to N-glycan2. The constituent saccharides of the complex branch are the same in both sequences, namely GlcNAc, Gal, and Neu5Ac, which consist of Gal(7)-b(1-4)-GlcNAc(6) and Neu5Ac(8)-a(2-6)-Gal(7). In the case of Gal(7)b(1-4)-GlcNAc(6), the principal minimum is shifted from (À75˚, 115˚) ( Figure 2F) to (À90˚,100˚) ( Figure 3F), and the area of the principal minimum decreases in N-glycan2. This data suggests that the addition of a secondary arm reduces the flexibility of the existing arm. Also, in N-glycan2, the free energy value of the secondary minimum increases above 1 kcal/mol compared to N-glycan1, which is around (À55˚, À55˚). Comparatively, the second complex branch of N-gylcan2 displays higher flexibility. The same linkage between GlcNAc(11) and Gal(12) in the additional branch samples around u, w ¼ (À70˚, 120˚) ( Figure 3K) as a principal minimum along with the secondary minimum under 1 kcal/mol. It indicates the coexistence of multiple conformations in an aqueous solution. So, the minimum value can change for a specific angle depending on the number of branches.
Next, we investigated the flexibility of the mannose (1-6) branch of N-glycan1 and N-glycan2. This branch is composed of three mannose (Man) residues: Man(5), Man(9), and Man(10). Man(5) is connected to Man(9) and Man(10) via a(1-6) and a(1-3) linkages, respectively (see Figure 1(A, B)). Man (5) is connected to Man(3) via the a(1-6) linkage. The (u, w) map of both linkages for both glycans are shown in Figure 2(H, I) and 3(H, I), respectively. The corresponding free energy minima are reported in Table 1. It is evident from the free energy profile and Table 1 that the conformational dynamics of the mannose (1-6) branch are similar for both glycans. It should be noted here that the minima values for the 1-6 linkages lie in the top 3 probable conformations as obtained from the PDB databases for small glycans (de Meirelles et al., 2020).
On the other hand, the 1-3 linkage of this mannose branch shows a stable and sizeable single minimum around (70˚, À100˚) (see Figures 2I and 3I). Further, the a(1-3) linkage between Man(4) and Man(3), which is the starting point of the complex branch, shows a similar free energy map for both glycans. In both the systems, broad conformation was observed by the wide range of w value which is also evident in another 1-3 branch between Man(5) and Man(10). Similar flexibility was also observed in different biantennary glycans, mainly depending on the w values Re et al., 2011). However, those glycans had only two complex branches still, the nature of this linkage persists in all the different sequences. Finally, all the terminal linkages involving sialic acids show greater flexibility irrespective of several branches. The location of the minimum regions coincides with different sets of N-glycans, specifically of the IgG Fc complexes (Harbison et al., 2019).
Supporting Information, Figure S12 depicts free energy maps for the orientation of the glycosidic linkage of all 1-6 and 2-6 linkages present in our simulation in terms of x and w. Similarly, we have displayed the same free energy maps in each interval (Supporting Information, Figures S13-S16), suggesting the convergence of simulations. All 1-6 and 2-6 linkages sample three conformational clusters, i.e. gauchegauche (gg), gauche-trans (gt), and trans-gauche (tg), as shown in Figure 4. The mannose branch (Man(5)-a(1-6)-Man(3)) samples the gauche-gauche cluster in both sequences, whereas gauche-trans is only found in the case of N-gly-can2. Interestingly, another 1-6 linkage in the mannose branch (Man(9)-a(1-6)-Man(5)) samples different conformation for two sequences. In N-glycan1, it stays in the gauchegauche (gg) cluster, whereas it moves towards trans-gauche (tg) in N-glycan2. In N-glycan2, the Neu5Ac(8)-a(2-6)-Gal (7) linkage samples only a very narrow region with high probability which belongs to the gauche-trans (gt) region. The other a(2-6) linkage of N-glycan2 and the only a(2-6) linkage of N-glycan1 sample the same conformational space as they jump between trans-gauche (tg) and gauche-trans (gt). The flexibility of the same linkage was also observed from the previous study (Harbison et al., 2019;Jo et al., 2016). The transition between different gg states is frequent, which can be easily observed from the time evolution of x, as shown in Supporting Information Figure S17.
Comparison of puckering free energy landscapes between N-glycan1 and N-glycan2 We have computed the puckering free energy profile of each monosaccharide present in both sequences, which comprise of N-Acetyl Glucosamine (GlcNAc), Mannose (Man), Galactose (Gal), and sialic acid (Neu5Ac). All these profiles are computed using the Cremer-Pople angle h, which describe the change of conformation of the sugar ring from the 4 C 1 (h ¼ 0˚) chair form to the 1 C 4 (h ¼ 180˚) chair form via the intermediate boat/skew-boat (h ¼ 90˚) structure. The free energy profile of all monosaccharides is presented in Figure  5. Alongside the free energy profile, all sampled canonical puckering conformations and the Mercator representations  Table 1. The location of principal minima of all the glycosidic linkages (u, w, x) of N-glycan1 and N-glycan2.
The terminal sialic acid (Neu5Ac) in both glycans prefer to stay in the 2 C 5 chair conformation, which again agrees with the crystallographic and NMR studies (de Meirelles et al., 2020;Turupcu et al., 2019). A similar free energy profile is obtained for Gal(7) of N-glycan1 and Gal(12) of N-glycan2. In both cases, the 4 C 1 state is energetically more favorable compared to 1 C 4 , agreeing with the previous NMR studies (Angyal, 1968). However, Gal(7) in N-glycan2 prefers both 4 C 1 and 1 C 4 chair conformations, as is evident from Figure 5F. The time evolution of the h coordinate of Gal(7) of N-glycan2 shows that the initial conformation of this ring is 4 C 1 , which shifts to the 1 C 4 chair after 500 ns of the simulation ( Figure  S0021). Further, it comes back to its initial ring conformation around 8.3 ls. Thus, Gal(7) of N-glycan2 shows a multi-microsecond rate of a shift in the ring puckering. However, this behavior is in contrast to what has been found in experiments. In the experimental study, the switching between two states was not observed (Roslund et al., 2004). This could be an artifact of the force field, as the other two galactose residues (Gal(7) of N-glycan1 and Gal(12) of N-gly-can2) stay in their respective initial chair form.

Dynamic cross-correlation analysis
Next, we calculated the dynamic cross-correlation map (DCCM) for both glycans, and shown in Figure 6(A, B). It is evident from Figure 6 that the overall anti-correlated motion decreases in the more extended sequence (N-glycan2) compared to N-glycan1. In N-glycan1, the terminal sialic acid, Neu5Ac(8), shows a high degree of anti-correlation (white rectangle, Figure 6A) with 3rd, 4th, and 6th saccharides, which are the connecting residues between the main and complex branches. This anti-correlation is diminished because of the additional complex branch (residue 11-13) in N-glycan2 (white rectangle, Figure 6B). Further, it can be observed that several residues in both glycans have a high degree of correlation among themselves only. Those residues are involved in several micro-level conjugated structural movements. The 3rd, 4th, and 6th residues, i.e. the branching mannose of the complex branch and the initial GlcNAc of the same branch, have correlated motions (red dotted box) with the Pearson correlation coefficient of more than 0.5. The other GlcNAc residue (11th) of the second complex branch of N-glycan2 also shows the same level correlation with the group mentioned above (blue dotted box, Figure  6B). Surprisingly, the residues in the mannose branch, i.e. 5th, 9th, and 10th mannose, do not show any correlation between the branching residues, whereas having a high correlation between themselves only in both the sequences (black square, Figure 6(A, B)). Also, the motion of these mannose residues has a slightly anti-correlation relationship with the terminal GlcNAc residues, which is not observed in the case of the terminal residues of the complex branches.

Principal component analysis
We also calculate the principal components (PC) on the heavy atoms of N-glycan to extract the structural variation between the N-glycan sequences by diagonalizing the covariance matrix. The first few eigenvalues describe the collective motion of local fluctuations. Each eigenvector and the corresponding eigenvalue were plotted in the decreasing order and shown in Supporting Information Figure S22. The first 10 PCs were responsible for 86% and 82% of the global motions of N-glycan1 and N-glycan2, respectively. Similarly, in our study, the first two eigenvectors contain 42% and 39% of the overall motion of N-glycan1 and N-glycan2.
We have constructed free energy landscapes using PC1 and PC2 to elucidate the conformational preferences of both glycans (see Figure 7(A, B)). Further, the time evolution of PC1 and PC2 is shown in Figure S0023. N-glycan2 samples a relatively larger conformational basin compared to N-glycan1, whereas the number of minima is more in N-glycan1. N-gly-can1 has a principal minimum around (5, À25) (P1 state) along with a secondary minimum (35,15) (P2 state), connected by a very shallow free energy barrier. There are also a  few minima present, separated by a high free energy barrier indicating the presence of multiple conformations at room temperature. The two dominant conformations are shown in Figure 7A. On the contrary, the free energy map of N-glycan2 displays a global minimum around (35, À5) (R1), which is surrounded by a comparatively high energetic structure. Overall, PCA suggests that the mobility of N-glycan1 is much higher than N-glycan2 because of its smaller size and lesser branching points compared to N-glycan2.
We have also performed PCA with respect to the common heavy atoms between both glycans to investigate the structural changes in the common segment of the structures. Since the first 2 PCs do not represent half of the total motion of either simulation, we construct the free energy map by taking the first three PC components (PC1 vs. PC2, PC1 vs. PC3, PC2 vs. PC3) for both systems. The first three PCs represent more than 50% of the total motion ($54% for N-glycan1, $60% for N-glycan2). The resulting FES is shown in Supporting Information Figure S24. As is the case with the entire glycan, here we also observe multiple minima connected via shallow energy in the case of N-glycan1 ( Figure  S24 A-C) compared to N-glycan2 ( Figure S24 D-F). So, both the PCA indicate a similar finding which is also supported by our previous analysis.

Conformational dynamics of glycan arms
To examine the movement of different branches and threedimensional structural features of the N-glycan chain, we have calculated the end-to-end distances between the 1st residue and the terminal residue of each branch, as well as branching angles. Overall, we have estimated four different distances, namely Distance1: GlcNAc(1) and Man(9); Distance2: GlcNAc(1) and Man(10); Distance3: GlcNAc(1) and Neu5Ac(8) and Distance4: GlcNAc(1) and Neu5Ac(13) (see Figure 8A). The first three distances are common in both glycans. The fourth distance is relevant for N-glycan2 only. The distribution of all the end-to-end distances is shown in Figure 8(B-E). It is evident from Figure 8B that the distance between GlcNAc(1) and Man(9) or Distance1 fluctuates between two states around 10 Å and 19 Å in both N-glycans. However, the intensity of the peak is different for both glycans. For N-glycan2, both peaks are almost equiprobable, while the peak at 19 Å is more probable than the other peak for N-glycan1. On the other hand, for both glycans, the other mannose branch displays a rigid behavior since the end-toend distance (Distance2) is characterized by a single peak located at $ 15 Å ( Figure 8C). In the case of N-glycan1, the distribution of Distance3 is characterized by a broad peak located at $ 25 Å, while the same peak shifts to a lower value ($ 15 Å) for N-glycan2 ( Figure 8D). This suggests that the addition of the b(1-4) arm in N-glycan2 restricts the existing b(1-2) arm's motion, making a compact rather than a linear structure. Finally, the average value of Distance4 is $ 20.0 Å since the distribution of the distance is characterized by a single peak located at $ 20.0 Å ( Figure 8E).
Apart from the end-to-end distances, three branching angles (see Figure 9A) are calculated. These branching angles are namely, Angle1: Man(9)-Man(5)-Man(10) (between mannose branches); Angle2: Man(5)-Man(3)-Man(4) (between mannose and complex branches) and Angle3: GlcNAc(11)-Man(4)-GlcNAc(6) (between complex branches). The distribution of these angles is shown in Figure 9(B-D). It is evident from Figure 9B that the branching angle between two mannose branches (Angle1) for both glycans shows a bi-modal distribution similar to the corresponding distribution of Distance1. So the mannose branch fluctuates between two states irrespective of the number of complex branches, whereas the intensity of both peaks is different for both glycans. In N-glycan1, the state around 110˚has almost double occupancy compared to the state around 160˚, whereas both states have almost similar occupancy in the case of N- glycan2. On the other hand, the primary branching angle, i.e. Angle2, behaves similarly in both glycans ( Figure 9C). We observe a sharp peak around 110˚, indicating a rigid structure of the pentasaccharide core in both cases. A similar trend is observed for the angle between two complex branches, i.e. Angle3 (see Figure 9D).
Finally, we have determined the curvature/bending of the two complex branches by measuring the angle among the tri-saccharides of each complex branch. These two angles are, namely, Angle4 (GlcNAc(6)-Gal(7)-Neu5Ac(8)) and Angle5 (GlcNAc(11)-Gal(12)-Neu5Ac (13)). The angular distribution of the b(1-2) arm of N-glycan1 ( Figure 9E) and the b(1-4) arm of N-glycan2 ( Figure 9F) is identical. For both cases, the distribution displays a sharp peak at 110˚along with a hump around 75˚. It suggests the flexible nature of the complex branches. On the other hand, in the case of N-glycan2, the distribution of Angle4 shows a sharp peak at $ 57˚and a secondary peak at 110˚( Figure 9E). This contrasts with what has been observed for N-glycan1. It suggests that the b(1-2) arm in N-glycan2 adopts a more compact and folded structure compared to the corresponding arm of N-glycan1. This agrees with the end-to-end distance, Distance3. This compact structure of the b(1-2) arm in N-glycan2 is achieved not because of the backward movement of the whole arm. Instead, it is because of the curved/folded conformation of the b(1-2) complex arm. Overall, our study suggests that the structural features of the complex branches changes upon the addition of another arm. It can be noted here that this Figure 8. Distribution of end-to-end distance corresponding to different branches of N-glycan1 and N-glycan2. A) schematic diagram displaying various angles B) distance between GlcNAc(1) and Man(9) (Mannose branch), C) distance between GlcNAc(1) and Man(10) (Mannose branch, D) distance between GlcNAc(1) and Neu5Ac(8) (Complex branch) and E) distance between GlcNAc(1) and Neu5Ac(13) (Complex branch). N-glycan1 and N-glycan2 data are shown in purple and green colors, respectively. nature is not transient and results from the interplay between different hydrogen-bond networks.
Furthermore, the cluster analysis was performed for both glycans using the k-means algorithm to get the most probable structures. Major conformations and their relative percentages are shown in Figure 10. The major conformation of N-glycan1 (38.3%) shows a typical curvature along its mannose branch, whereas the complex branch adopts a linear structure. This is also evident from the distance analysis, specifically in Distance1 and Distance2. On the other hand, the major conformation of N-glycan2 (39.3%) shows an asymmetric conformation of its two complex branches. Specifically, one of the branches has a typical sharp bending form between GlcNAc and sialic acid residues, which is already found in our distance and angle calculations, i.e. Distance3 and Angle4. For N-glycan2, all three major conformations show similar curvature of the one complex branch. So the major changes in the end-to-end distance and the angle complement the cluster analysis, which displays the global change of both structures.

Inter residue and water-mediated hydrogen bond networking
The conformation of oligosaccharides relies on hydrogen bonds (Almond, 2005;Ellis et al., 2012). We have performed the hydrogen bond (H-bond) analysis for both glycans and listed all the critical inter-residue hydrogen bonds in Table 2. As water plays a critical role in the conformational dynamics of carbohydrates, we have also investigated the propensity of water-mediated hydrogen bonds and reported in Table 3. Further, we have displayed both types of hydrogen bonds in a schematic diagram in Supporting Information, Figure  S25(A, B). In Figure S26, we have displayed the 3 D conformer of both glycans depicting water-mediated H-bonds. As discussed already, the initial tri-saccharides stay stable because of the formation of a higher percentage of interresidue and water-mediated H-bonds (see Tables 2 and 3).
The mannose branch's interaction pattern primarily depends on the water-mediated H-bonds with an occupancy varying between $ 10-15% (see Table 3). We have not observed any direct inter-residue H-bond for the mannose branch in both glycans. However, direct inter-residue Hbonds are found in complex branches of both glycans. The complex branch shows a moderately stable H-bond between GlcNAc(6) and Gal (7) with an occupancy of 25.35% and 23% for N-glycan1 and N-glycan2, respectively. In the case of N-glycan2, Gal (7) forms two H-bonds with GlcNAc(6) while a single H-bond is observed between these two residues in the case of N-glycan1. Further, the water-mediated H-bond between Gal(7) and Neu5Ac(8) is more stable in N-glycan2 (48.21%) compared to N-glycan1 (29.52%). In the case of N-   (3), which is found to be absent in the case of N-glycan1.
Because of such H-bond interactions, the b(1-2) branch of N-glycan2 adopts a stable folded conformation than the corresponding arm in N-glycan1.
The hydrogen bond profile of the b(1-4) branch is quite different from the b(1-2) arm in N-glycan2. The terminal Neu5Ac(13) forms a water-mediated H-bond with GlcNAc(6), Gal(12), and GlcNAc (11), with an occupancy varying between 10 and 24%. GlcNAc(11) forms not only direct H-bonds with Man(4) and Gal(12) (see Table 2) but also forms water-mediated H-bonds with the same residue (see Table 3). Because of such an H-bond network formation, the b(1-4) branch of N-glycan2 adopts a linear conformation, which contrasts with what has been observed for the b(1-2) branch of N-glycan2. This result agrees with the analyses of the end-to-end distances and branching angles.
Alongside the water-mediated hydrogen bonds, we have also calculated the pairwise correlation function or the radial distribution function g(r) of water molecules for each monosaccharide (oxygen atoms) of the two glycans and shown in Supporting Information Figure S27(A, B). The overall distribution pattern is similar for both N-glycans, similar to other carbohydrate studies (Jana & Bandyopadhyay, 2012;Roy et al., 2020). The terminal residues, i.e. mannose and sialic acid of both the branches, show a relatively higher peak intensity because of their greater exposure to water molecules. In contrast, the initial GlcNAc(1) residue displays a relatively lower intensity due to the strong hydrogen bond between its neighboring residues. The first peak intensity is found to be the least for Man(3) and Man (5) in both glycans because being branching residues; their hydroxyl groups are less exposed to water molecules. A similar trend was also seen for Man(4) of N-glycan2 after adding one more complex branch in N-glycan2, i.e. the value of g(r) changes from 1 to 0.6. Thus, this information amalgamates with the water-mediated hydrogen bond analysis and portrays the critical role of solvent in shaping the global conformations of glycans in an aqueous solution.

Conclusion
Here, we have performed a conventional MD simulation of two N-glycans found on the surface of HIV glycoprotein in an aqueous solution for 20 ls to examine the change in preferred conformation upon adding a complex branch. We have elucidated different levels of motion starting from the flexibility of glycosidic bonds, sampling of all possible puckering states to the conformational shift at the macro-level, i.e. movement of different branches relative to each other. Overall, the addition of a complex branch reduces the flexibility of N-glycan2 compared to N-glycan1. Man(4) and Man(5), corresponding to the branching point of the mannose and complex branches, respectively, display an opposite fluctuation pattern, as is evident from the root mean square fluctuation analysis. Although the average RMSD value of N-glycan2 is higher than N-glycan1, the distribution is much wider in the case of N-glycan1 than N-glycan2. This again suggests that the addition of a complex branch restricts the overall glycan conformational variability. The principal component analysis also shows the existence of multiple minima in the case of N-glycan1, indicating the flexible nature of N-glycan1. In contrast to N-glycan1, the free energy surface is characterized by a single minimum in the case of N-glycan2. The analysis of end-to-end distances, branching angles, and hydrogen bonding yields the different dynamic behavior of both sequences. A close folded structure of the b(1-2) arm of N-glycan2 is observed, while the same arm adopts a linear conformation in N-glycan1. This illustrates the influence of adding a branch on the overall structural dynamics of a glycan.
The systematic investigation of free glycans in solution plays a key role in understanding the different biologically relevant recognitional postures as glycans are highly flexible. The 3-dimensional conformation of these glycans in the free state will give a proper understanding of the different energetically minimized conformation that can be further used as a part of HIV glycoprotein for elucidating the binding mechanism between glycan epitopes and broadly neutralizing antibodies (bNAbs). Different sets of antibodies like PG9, PG16, PGT128, VRC26 show strong binding affinity with the a(2-6) sialylated branch containing glycans as reported in the experimental study (Shivatare et al., 2018). Also, the two diasialyated antennae of N-glycan2 better recognize PG9 compared to N-glycan1. This study helps identify different structurally relevant units within the glycan chain relevant to the structure prediction from the sequence (Fogarty et al., 2020). Our study can be extended by attaching these N-glycans to HIV glycoproteins and investigating the interaction between glycans and antibodies. This information may serve the initial requirement of potent HIV vaccine design.

Supporting information
Probability distribution of RMSD; PMF of w/x glycosidic angle; Mercator representations of puckering conformations; Time evolution of puckering coordinates; Eigen vector versus Eigen value; Schematic diagram displaying various hydrogen bond interactions.

Disclosure statement
No potential conflict of interest was reported by the authors.

Ethics
There is no human or animal experiment in this study.

Data availability statement
The data that support the findings of this study are available from the corresponding author (PK) upon reasonable request.