Molecular insights into the interaction of apo-lactoferrin with the receptor binding domain of the SARS-CoV-2 spike protein: a molecular dynamics simulation study

Abstract LF is a bioactive protein, derived from colostrum and milk that has been found to possess various immunomodulatory, iron chelating, and antimicrobial properties, especially in its apo-form. Recent studies have demonstrated the functionality of LF in attaching to the S proteins of SARS-CoV-2, thereby preventing it from interacting with the ACE-2 receptor. However, the molecular mechanism mediating the process is poorly understood. In this study, molecular docking and MD simulations coupled with free energy calculations were applied to elucidate the key interaction of apo-LF and its N-lobe and C-lobe derivative forms with the RBD of coronavirus S proteins. This has also been extended into evaluating the L452R mutant, which is associated with the delta variant of SARS-CoV-2. The results demonstrate the efficacy of the apo-LF C-lobe in binding to the RBD of both variants, primarily through electrostatic attractions between the acidic residues of the former and the basic residues of each RBD. Furthermore, due to the additional arginine in the L452R variant, the interaction between the C-lobe and the latter is stronger, resulting in a more favourable binding and tightly bound structure. The simulations highlight that the C-lobe, followed by full-length apo-LF can form a multimeric complex with the RBD of SARS-CoV-2, indicating their potential use as novel therapeutics, particularly the cleaved C-lobe of apo-LF to disrupt the S proteins from binding to the host ACE-2 receptor. Communicated by Ramaswamy H. Sarma


Introduction
At the end of 2019, there was an emerging variant of coronavirus found in Hubei, China, known to induce common flu-like symptoms, including sore-throat, fever, coughing, and difficulty in breathing to serious ailments, such as respiratory failure and sepsis (Department of Health & Human Services Similarities & differences between Flu & COVID-19,19, 2021;Hui et al., 2020). This ongoing corona virus disease was caused by the sister of the original severe acute respiratory syndrome virus of 2002 and was designated as SARS-CoV-2 (Coronaviridae Study Group of the International Committee on Taxonomy of Viruses, 2020). To date, SARS-CoV-2 has continued to disrupt the global health community, resulting in many infections world-wide and causing over 5.57 million deaths as of December 2021 as well as having a profound impact on the economy, leading to numerous job redundancies (Nicola et al., 2020). Consequently, this has led to exhaustive clinical trials and rapid development of vaccines, which could provide adequate protection against SARS-CoV-2 (Kaur & Gupta, 2020). In addition to vaccines, there has been increasing interest in the application of therapeutics, including bioactive compounds in olives (Pitsillou et al., 2020) as alternative measurements to fend off the disease.
LF is a biologically active protein, present in milk that is known for its synergistic effects with various probiotics, including lactic acid bacteria (Chen et al., 2013; and broad-spectrum antimicrobial capacities against pathogenic bacteria, viruses, and fungi (Darmawan et al., 2022;Wang et al., 2019). In its apo-form, this functionality is elevated due to the additional iron sequestering capability, effectively depriving iron from iron dependent pathogens (Kell et al., 2020;Lechowicz & Krawczyk-Balska, 2015). Recent study has shown that the combination of LF encapsulated by liposomes, zinc solution, and vitamin C could delay the onset of COVID-19 symptoms and lessen the timeline of recovery (Serrano et al., 2020). Furthermore, LF was found to be able to interact with the S protein of SARS-CoV-2, thereby, blocking it from binding onto the ACE-2 receptor (Chang et al., 2020;Gallo et al., 2022;Rosa et al., 2021). However, the molecular mechanism, mediating this interaction is poorly understood. In our previous studies, MD simulations have been proven as useful tools in understanding the stability of apo-LF and its interaction with whey proteins under various conditions, including gastric digestion , spray drying , and freeze drying . Accordingly, it is hypothesised that the application of MD simulations could be extended to elucidate key interactions between LF and the viral S protein of SARS-CoV-2.
In the current work, MD simulations have been implemented to elucidate the binding between the aforementioned protein components, with a special emphasis on the interaction of the RBD of the S protein with the apo-form of LF. This has also been extended by not only comparing interaction of the native and mutated (L452R) RBDs of SARS-CoV-2 with apo-LF, but also investigating the interaction between the RBD with smaller proteins derived from apo-LF, comprising the N-lobe and C-lobe. Therefore, this will enable a more in-depth understanding of the mechanism and the specific segment of apo-LF that is most likely to interact strongly with the viral RBD. In addition, examination of the separate C-and N-lobe cleaved proteins will allow prediction of the possibility of employing smaller apo-LF derivatives as potential agents for treating or preventing coronavirus disease. In general, our results illustrate the potential use of apo-LF, in particular the C-lobe as a natural means to bind onto the RBD of SARS-CoV-2 and hence deter the attachment of the RBD of SARS-CoV-2 onto the ACE-2 receptor.

Computational details
All methods, including optimisation of the protein structures were adapted from our previous work  with the following changes:

Structural models and protein-protein docking
The structures of LF and the RBD within the S proteins, representing the native and mutated (L452R) strains of SARS-CoV-2 were obtained from the RCSB Protein Data Bank with a PDB ID of 1BLF, 6M0J and 7ORB, respectively (Lan et al., 2020;Liu et al., 2021;Moore et al., 1997). To generate the optimised multimeric structure, comprising apo-LF and the individual RBD, protein-protein docking was performed in accordance with HDOCK (Yan et al., 2020). HDOCK is a novel hybrid protein-protein docking protocol that combines template-based homology modelling and ab initio calculations. During this procedure, all components except the proteins of interest were omitted, corresponding to true protein-protein interactions. In addition to docking the full-length apo-LF to each RBD, the N and C lobe of the same apo-LF structure were also docked to the latter. The generated structures and poses with the lowest docking score were further evaluated through MD simulations and subsequent free binding energy calculations (Refer to Table S1 in Appendix A, Supplementary Data). In total, there are six docking structures, consisting of apo-LF and its subsidiary forms bound to the RBD of the two types of SARS-CoV-2.

Molecular dynamics simulations and simulation trajectory analyses
MD simulations were performed using GROMACS version 2018.3 (Abraham, 2015;Hess et al., 2008). The topology of each protein was generated using CHARMM27 force field (Bjelkmar et al., 2010). All proteins were then placed separately in a cubic cell with a distance of 17 Å from the centre of the proteins to the edge of the box. These were then solvated with TIP3P water models, along with chloride counterions to neutralise the positively charged systems (Jorgensen et al., 1983). Specifically, 15 chloride ions were added into the box, containing the full-length structure of apo-LF and the native RBD as well as the N-lobe and the latter. 1 chloride ion was put to balance the charge between the C-lobe and original RBD. Similarly, 13 chloride ions were used to neutralise the charge between the mutated RBD and the fulllength apo-LF structure, whilst 16 chloride ions were utilised to offset the interaction between the positively charged Nlobe and the former, leaving 2 chloride ions that were generated to counteract the binding of the C-lobe to the mutated RBD. Modified Berendsen thermostat and iso-tropically controlled Parrinello-Rahman were implemented to equilibrate all simulated systems for 100 ps under NVT (constant Number of particles, Volume, and Temperature) and NPT (constant Number of particles, Pressure, and Temperature) ensembles, respectively (Berendsen et al., 1984;Parrinello & Rahman, 1980;1981;1982). In total, there are six simulated systems, consisting of the full-length apo-LF and its cleaved N-lobe and C-lobe derivatives bound to either the native or mutant RBD of SARS-CoV-2. Each simulation was performed with a random seed and set at 310 K, a physiological temperature of the human body for 100 ns with a timestep of 2 fs. VMD was used to evaluate and visualise the trajectory obtained from the MD simulations (Humphrey et al., 1996). MM-PBSA methods were computed to estimate the Gibbs free energy (Baker et al., 2001;Open Source Drug Discovery Consortium, 2014). Van der Waals, electrostatic, and polar solvation energy contributions were calculated from the adaptive Poisson-Boltzmann solver, whilst SASA was used to approximate the non-polar solvation energy, with a grid spacing of 5 nm. Values of 80 and 2 were selected as the solvent and solute dielectric constant, respectively, as described in the g_mmpbsa tutorial package. In this calculation, the coordinate of each generated multimeric complex was saved for every 10 frames. Three 1000 ps segments from 99 to 100 ns, 89 to 90 ns, and 69 to 70 ns were selected as the timeline for the MM-PBSA calculation to ensure sampling of the trajectory during periods of time when the complex undergoes different structural changes (detailed in Results below) during the binding process throughout the simulation (Hou et al., 2011). In addition, complementary pairwise sequence alignment was performed to identify possible regions in apo-LF that may bind onto the RBD of SARS-CoV-2 S protein. Needleman-Wunsch algorithm as implemented in EMBOSS Needle was employed to produce the global optimal alignment (Madeira, 2022;Needleman & Wunsch, 1970). EBLOSUM62 scoring matrix with a gap penalty and extend penalty of 10 and 0.5 were selected during the process.

Statistical analysis
Statistical analysis was performed via GraphPad Prism version 8.0.0 (GraphPad, California, USA) and Minitab version 19 (Minitab Inc, Pennsylvania, USA). All simulations were performed at least in triplicate and the results were expressed mean ± standard error of the mean, where feasible. One-way analysis of variance (ANOVA) coupled with post-hoc Tukey test at 95% confidence interval and p < 0.05 was selected as the statistical measurement to determine the dynamics of the protein structures.

Molecular docking and molecular dynamics
simulations of the binding of apo-lactoferrin and its N and C-lobe derivatives to the receptor binding domain of SARS-CoV-2: Structural stability and qualitative analyses Figure 1A compares the RMSD values of a multimeric complex, composed of apo-LF and each of the RBD, coming from the native and mutated variant. It was found the protein structure, made up of apo-LF and the mutated RBD did not form a stable complex, in contrast to the one that consists of apo-LF and the native RBD. Whilst the RMSD value for the protein complex that involves the mutated RBD reaches its peak at 70 ns with a value of approximately 4 nm, the one for the native variant has as steady deviation, making up about 0.3 nm throughout the simulation run. Figure 1B illustrates the RMSD values of a combined protein structure, comprising the N-lobe and the respective RBD. Both graphs show high values of the RMSD throughout the simulation time, which is indicative for heavy mobility. While the RMSD values for the protein complex, involving the N-lobe and the native RBD remain around 1 to 1.5 nm, the ones for the Nlobe and mutated RBD is slightly higher, accounting for approximately 2 À 3 nm. Figure 1C shows the average RMSD values for protein complexes that were made up of the Clobe of apo-LF and the native as well as the mutated RBD. The data revealed that the addition of the native RBD to the C-lobe causes a degree of deviation to the overall tertiary structure, resulting in a fluctuating trend from about 0.4 to 0.6 nm for the first 70 ns. Conversely, the macromolecular assembly, comprising the C-lobe and the mutated RBD is stable, as demonstrated by the relatively low equilibration of the RMSD trend at 40 ns, with values that are about 0.35 À 0.4 nm. To verify the results from the preliminary analyses, separate RMSD graphs, comprising the proteins in their individual form were also computed (refer to Figures S1A, B, C, D, E, & F). When they are not bound together, the proteins, coming from lactoferrin and the RBD of SARS-CoV-2 have a mild deviation compared to the combined structures. This is particularly apparent for the N-lobe that was attached to both RBDs as well as the full-length apo-LF structure, bound to the mutated RBD, which indicates their instabilities. In general, the preliminary RMSD analysis, combined with the ANOVA and post-hoc Tukey test shows that while the Nlobe are not able to form complexes with both variants. On the contrary, the full-length structure of apo-LF is only able to form a protein complex with the native RBD. Lastly, the Clobe can form complexes with both variants.
To obtain details regarding the morphological changes, including the compactness and levels of the water penetration to the surface of the protein, the R g and SASA of the multimeric complexes were calculated. Figure 2A illustrates the average 100 ns of the R g values for protein complexes formed from apo-LF and each of the RBD. The R g graph for the complex, composed of apo-LF and the native RBD has steady trend with a value of about 2.7 nm, whereas the graph, demonstrating the R g value for the complex, made up of the former and the mutated RBD has a fluctuating trend. It reaches its peak at about 4 nm at the end of the 100 ns simulation time. Figure 2B compares the R g values for protein complexes, involving the N-lobe and each of the RBDs. The two-line graphs exhibited an overlapping motion. Whilst the R g values of the complex, comprising the N-lobe and the native RBD keep steady at about 2 nm for the first 20 ns before alternating between 2.2 and 2 nm for the remainder of the simulation time, the R g values for the multimeric complex, composed of the former and the mutated RBD begin at a lower value of about 0.17 to 0.18 nm for the first 20 ns. Afterwards, it continues to increase until it reaches 2.5 nm, followed by an oscillating movement in between 2.2 and 2.5 nm. Figure 2C highlights the R g for the C-lobe that binds to the native and mutated RBDs. The R g for the complex, involving the former and the native RBD has a higher value throughout the simulation time, which makes up from 2.5 to 2.7 nm, whereas the R g for the protein complex, comprising the former and the mutated RBD has far lower R g values, accounting for approximately 1.9 nm. Overall, the calculation is in congruent with the RMSD values where in general, the C-lobe could form complexes with the both variants, followed by the full-length structure of apo-LF, which interacts with the native strain. These are demonstrated by the maintenance of the overall shape of the protein complexes throughout the 100 ns simulation run. Figure 2D illustrates the SASA graph for the full-length apo-LF structure that was attached to the RBD of both variants. The SASA for both complexes are stable throughout the simulation time. The average areas where water can penetrate to each of the protein complexes are 390 and 410 nm 2 , respectively. Figure 2E shows the SASA graphs for the N-lobe combined with the native and mutated RBDs. The graphs show similar values; each with a SASA value of about 250 nm 2 throughout the simulation time. Figure 2F compares the SASA values of the protein complex, consisting of the C-lobe and the mutated as well as the native RBDs. Both graphs demonstrated a steady upward trend for the first 40 ns before equilibrating at 60 ns, which results in an overall SASA of 240 and 260 nm 2 , respectively for the protein complex, involving the mutated RBD with the C-lobe as well as the native RBD with the C-lobe. In essence, the instability projected by the RMSD and R g are not reflected by the SASA trends. The SASA for the full-length structure apo-LF was bound to the native and mutated RBDs is higher compared to those of the N-lobe and C-lobe, which are attached to either of the RBDs due to larger surface area available for water to penetrate. Whilst the average SASA for the N-lobe that was bound to each RBD as well as the C-lobe, which was attached to the native RBD, showed an average value of about 250 nm 2 , the SASA trends for the full-length apo-LF structure and C-lobe boudn to the mutated RBD have slightly lower SASA values of approximately 390 and 240 nm 2 . This is consistent with the R g calculations, where, intuitively, the protein complex that Figure 1. Stability of apo-lactoferrin, the N-lobe, and the C-lobe complexed with the receptor binding domain of SARS-CoV-2. The root mean square deviation values of the multimeric complex of apo-lactoferrin (A) with the native (blue) and mutated (red) receptor binding domains. The root mean square deviation values of the multimeric complex of the N-lobe (B) with the native (blue) and mutated (red) receptor binding domains. The root mean square deviation values of the multimeric complex of the C-lobe (C) with the native (blue) and mutated (red) receptor binding domains.

Figure 2.
Morphological stability of apo-lactoferrin, the N-lobe, and the C-lobe complexed with the receptor binding domain of SARS-CoV-2. The radius of gyration of the multimeric complex of apo-lactoferrin (A) with the native (blue) and mutated (red) receptor binding domains. The radius of gyration of the multimeric complex of the N-lobe (B) with the native (blue) and mutated (red) receptor binding domains. The radius of gyration of the multimeric complex of the C-lobe (C) with the native (blue) and mutated (red) receptor binding domains. The solvent accessible surface area of the multimeric complex of apo-lactoferrin (D) with the native (blue) and mutated (red) receptor binding domains. The solvent accessible surface area of the multimeric complex of the N-lobe (E) with the native (blue) and mutated (red) receptor binding domains. The solvent accessible surface area of the multimeric complex of the C-lobe (F) with the native (blue) and mutated (red) receptor binding domains. Table 1. Comparison of the average energy contribution of the binding of selected apo-lactoferrin forms to the receptor binding domain of SARS-CoV-2 (kJ/mol).
In order to extend the molecular insight at a residue level, the RMSF of the associated apo-LF structure, the N-lobe, and the C-lobe structures were computed (Figure 3). A similar yet fluctuating trend was observed for the RMSF curves for both apo-LF that was bound to both RBDs ( Figure 3A). Specifically, this includes several electrostatic driving residues, including Lys85, His420, Glu510, Arg654 and Arg689. Moreover, an extreme behaviour was observed for residues in the antibacterial regions as well as other residues N-lobe that interact with both RBDs, including the native and mutated variant (Bellamy et al., 1992;van der Kraan et al., 2004) (Figure 3B). High values of about 1.0 and 1.5 nm were projected by the curve, depicting the interaction of the N-lobe with the native and mutated RBDs, individually. Conversely, the RMSF values for residues in the C-lobe that binds to each of the RBDs ( Figure 3C). This was demonstrated the relatively minimum fluctuation of the residues, except for His420, Arg654, and Arg689; each with a RMSF value of approximately 0.4, 0.6, and 0.5 nm. This result is consistent with the RMSF values for the full-length structure, and therefore shows that the basic residues within the C-lobe may not interact favourably with both RBDs. The findings indicate that the RMSF calculations are consistent with the RMSD and R g trends, in which the Clobe of apo-LF can form favourable interactions with the RBD of SARS-CoV-2, whereas the full-length structure can only form interactions with the native RBD, leaving the Nlobe, which could not form a stable complex with any of the RBDs. Overall, MD simulations highlight that the initial structure, obtained from the protein-protein docking (refer to Figure S2, Appendix A, Supplementary data), do not depict a stable complex, as demonstrated by the above RMSD, R g , SASA, and RMSF of each protein constituent. These calculations have been verified further using statistical analyses, comprising ANOVA and post-hoc Tukey test. These instabilities also include the interaction of the N-lobe with both RBDs as well as the full-length structure of apo-LF with the mutated RBD.
Simulations were also visualised via VMD to obtain further details of the results. Six snapshots of these simulations, which were obtained from the final frame of each trajectory, are depicted in Figure 4 (Refer to Figures S2 and S3 in Appendix A, Supplementary Data to illustrate the complexation and unbinding processes). At 0 ns, docking calculations initially showed the possible formation of stable conformations for all complexes, indicating potential binding capacity of the full-length structure of apo-LF, the N-lobe, and C-lobe to both RBDs. Nevertheless at 50 ns, the unbinding between some complexes was seen. Whilst the C-lobe was able to adhere to each RBD, the full-length structure of apo-LF could only attach to the native RBD. In contrast, the N-lobe could not form a multimeric complex to any of the RBDs. This trend was confirmed at 100 ns, where it is evident that the complete structure of apo-LF could only bind to the native RBD ( Figure 4A and D), whereas the C-lobe of apo-LF could form a stable complex with both forms (Figure 4C and F). This leaves the N-lobe of apo-LF that could not bind to any of the RBDs, as shown by the separation on the picture ( Figure 4B and E). In addition, mean distance and number of contacts analyses were computed to further support the observation. The distance between the N-lobe and the native RBD is the farthest at 0.6 nm, whereas the distance of the complete structure of apo-LF and C-lobe structures to the latter are approximately the same at 0.2 nm ( Figure 5A). This is supported by the number of contacts analysis where the N-lobe forms the least contacts with native RBD, accounting for approximately 2000, followed by the full-length structure and the C-lobe of apo-LF, each with a corresponding number of 4500 and 3500 ( Figure 5B). One plausible reason why the contacts between the full-length structure of apo-LF and the native RBD exceed the contacts between the C-lobe and the latter is that there are more surface areas available for the proteins to interact (see Figure 4A); however, this does not necessarily demonstrate that the interaction is favourable. Slightly different results were seen for the interaction between the mutated RBD and the apo-LF, including the Nlobe and C-lobe. A downward trend was observed for the distance analysis, whereas the opposite occurred for the number of contacts analysis. The separation between the full-length structure of apo-LF and the mutated RBD is the longest, with a value of nearly 2.0 nm, resulting in the least number of contacts, accounting for less than 500 ( Figure 5C and D). Conversely, the mutated RBD was able to form a complex with the C-lobe, resulting in minimum distance of less than 0.5 nm as well as increased contacts up to 5000. The number of contacts between the N-lobe and the mutated RBD is lower compared to the C-lobe with the latter, making up about 2000. This is primarily due to the increased distance between the two of up to 0.6 nm. In general, the distance and number of contacts analyses show that C-lobe can maintain a short proximity with either of the RBD, thereby, forming continuous contacts with both RBDs. This is followed by the full-length structure of apo-LF, which also stays at a minimum vicinity from the native RBD and maintain steady contacts with the latter, leaving the N-lobe, which could not effectively adhere to any of the RBDs.

Free energy binding, hydrogen bond, and sequence alignment analyses: the C-lobe of apo-lactoferrin dominates the interaction with the receptor binding domain via electrostatic attraction
MM-PBSA calculations were conducted on MD simulations in order to estimate the free energy, involved in the binding (Table 1). In general, the free energy binding analysis indicated that electrostatic and van der Waals interactions are two of the predominant forces that drive the two proteins together. Binding energy values of À216 and À211 kJ/mol were obtained for van der Waals and electrostatic contributions, respectively for the interaction between the C-lobe and the wild type of RBD, whereas approximately À391 and À908 kJ/mol of energies were obtained, individually for the interaction between the former and the mutant. While polar solvation of energy impedes the formation of a complex between the C-lobe and both of the RBDs, with each energy equal to 412 and 1183 kJ/mol, SASA energy, on the other hand, assists the binding process, as shown by the negative values, accounting for À30 and À55 kJ/mol, respectively. In total, the enthalpy of binding of the C-lobe to native and mutated RBDs is À45 and À171 kJ/mol, individually. Nonetheless, a weak complexation was found for the binding between the full-length structure of apo-LF and the RBD of the native strain of SARS-CoV-2. This is primarily due to high polar solvation of energy, making up 705 kJ/mol, which outweighs the favourable van der Waals, electrostatic, and SASA interactions (-471, À15, & À64kJ/mol, respectively). Therefore, the sum of the total binding energy is 155 kJ/mol, demonstrating non-spontaneous binding between the two constituents as energies are required to assist with the binding. Overall, MM-PBSA calculations coupled with the aforementioned statistical analysis highlight the favourable complexation occurred between the C-lobe and both RBDs, whereas a weak binding is exhibited between the full-length structure of apo-LF and the native RBD.
Furthermore, to complement the results obtained from MM-PBSA calculations, hydrogen bond analysis was conducted to compute the interaction between the milk constituent and the respective RBD ( Figure 6). Whilst the average number of hydrogen bonds forming between fulllength structure of apo-LF and the native RBD ranges from about 3 to 12 ( Figure 6A), the hydrogen contacts, adjoining the C-lobe and the native RBD fluctuate from 2 to 12 ( Figure  6B), leaving the high proportion of hydrogen bonding from 7 to 17 that inadvertently links the C-lobe and mutated RBD ( Figure 6C). The above MM-PBSA and hydrogen bond calculations are supported with the results produced by EMBOSS Needle pairwise sequence alignment, as shown in Figure 6D and E. When apo-LF was aligned with the native RBD, there are four distinct regions in the C-lobe that demonstrated a degree of similarity. Specifically, this is from residue Thr474 to Ser484 (red); Lys494 to Phe565 (blue); Gln581 to His591 (green); and Leu613 to Leu668 (yellow). Similarly, there are also four differing regions in the C-lobe of apo-LF that was aligned with the mutated RBD. These are from Gln473 to Ser484 (red); Lys494 to Phe565 (blue); Gln581 to His591 (yellow); and Leu613 to Glu660 (green) (Refer to Figure S4 for the complete result of the pairwise sequence alignment).

Leading residues of the C-lobe in apo-lactoferrin, which bind to the receptor binding domain of SARS-CoV-2
According to Figure 7A and B, acidic and basic amino acid residues are the primary elements in the respective proteins that contribute to the binding. This is demonstrated by the high negative energy values of the negatively charged amino acid residues in the C-lobe of apo-LF, including Asp390 ($-30 kJ/mol), Asp602($-25 kJ/mol), Glu658($-25 kJ/mol), and Glu682($-25 kJ/mol) that interact with the positively charged amino acid residues, such as Lys424($-20 kJ/mol) of the RBD of the native SARS-CoV-2. Similarly, the interaction between the C-lobe of apo-LF and the mutated RBD is sustained by the contribution of acidic amino acid residues in apo-LF and basic amino acid residues in the RBD. Predominantly, these include Asp378, Asp462, Asp484, Glu682, Arg403, and Arg408 each with an energy value of approximately À50, À55, À60, À50, À50 and À45 kJ/mol. Nevertheless, the positively charged residues of the C-lobe counteract the binding process. For the C-lobe, which interacts with the RBD of the wild type, these consist of residues Lys404 ($30 kJ/mol), Arg600 ($40 kJ/mol), and Arg654 ($30 kJ/mol). Likewise, the residues, including Lys386 ($60 kJ/mol), Arg463 ($50 kJ/mol), Lys498 ($55kJ.mol), and Lys473 ($65 kJ/mol) of the C-lobe that interact with the mutant also hinder the formation of the protein complex. In addition, the acidic amino acid residues of the RBDs have the opposite effect on the binding capacity between the two proteins. For instance, residues Asp420 of the native RBD as well as Asp405 of the mutated RBD do not support the complexation. The energy absorbed by these residues are about 30 and 50 kJ/mol, respectively. These data demonstrate that overall the negatively charged amino acid residues of the C-lobe of apo-LF bind favourably to the positively charged amino acid residues of each RBD, resulting in the formation of stable protein complexes. Molecular graphics, portraying the interaction between the two proteins were also drawn to visualise the key adjacent residues (Figure 8). The findings highlight electrostatic interactions as the main interaction between the protein constituents, including the negatively charged residues of the C-terminal and positively charged residues of the two RBDs. The primary neighbouring residues, binding the C-lobe with the native RBD include Glu664, Glu658, Glu682, Asp602, Arg457, Lys462, Arg466, Lys424, Lys462, and Arg466. Nevertheless, as elucidated in the per-residue decomposition analysis, the basic amino acid residues within apo-LF, including Arg603, Lys604, and Lys608 do not support the binding. Interestingly, there are also some aromatic residues that contribute to the binding, comprising Tyr453 and Tyr505. Likewise, the interaction between the C-terminal of apo-LF and mutated RBD is fundamentally maintained by the contribution of acidic amino acid residues of the C-terminal, comprising Glu485, Asp378, Glu682, and Asp405 as well as the basic amino acid residues of the RBD, including Arg403, Arg454, Arg452, and Arg509. Conversely, the acidic amino acid residues in the RBD and the basic amino acid residues in the C-lobe impede the interaction. These residues include Asp442, Lys485, Lys674, and Lys404. There is also one residue, Leu502 that shows slight favourable binding contribution to the formation of the protein complex. When comparing the two structures of the conjoined proteins, the C-terminal of apo-LF binds more tightly to the mutated RBD than the native RBD, resulting in a more compact structure of the former compared to the latter. In general, the findings indicate that the mutation from leucine to arginine at position 452 results in a more firmly bound structure compared to the multimeric structure, composed of the native RBD and the C-lobe.

Discussion
The purpose of this study was to investigate the interaction of apo-LF, along with its derivative forms, specifically the N and C-lobe with the RBD of SARS-CoV-2 S protein. Generally, the findings of this study demonstrate the capacity of apo-LF, particularly the C-lobe to interact with the RBD of both the native and mutant. Firstly, the RMSD relative to the initial position of the protein was employed as an analysis tool to gain an overview of the structural stability of the system throughout the simulation. Based upon this data, the interaction between the C-lobe with both the native and mutated RBDs is the most stable one, followed by the interaction of the full-length structure of apo-LF with the native RBD. Furthermore, there are elevated RMSD values obtained for the full-length structure of apo-LF compared to those of the N and C-lobes. These elevated RMSD values are consistent with the higher molecular size of apo-LF compared to the N and C-lobe as well as compactness and the shape of the generated complexes as demonstrated by the R g values of each complexes formed (Sargsyan et al., 2017). This calculation has also been verified by the ANOVA and post-hoc Tukey test, where it was found the structural deviation of all the protein complexes is statistically significant. Nonetheless, some of the generated protein complexes are qualitatively unstable, especially the full-length apo-LF bound with the mutant variant as shown by not only the splitting of the RMSD of Figure 1A. This instability is also present in the simulation of the binding of the N-lobe to either of the RBDs. Conversely, when the multimeric structure of each complex was decomposed to the individual monomer, the results showed that each structure that makes up the whole protein complex is stable. Therefore, it is shown that some regions in apo-LF may not be compatible or form favourable interactions with the RBD of either variant. Overall, the Clobe of apo-LF is the promising region that could bind to the RBD of SARS-CoV-2 S protein.
Despite the distinct results obtained from the initial RMSD analysis, the levels of water penetration remain relatively constant, as demonstrated and described by the relatively stagnant SASA values. The SASA values for the full-length apo-LF as well as the C-lobe that form complexes with the mutated RBDs are moderately lower compared to the former that bind to the native variant. These trends are demonstrated by the R g calculation, where the generated multimeric structures associated with the mutants have a more compact structure compared to the corresponding proteins that were bound to the wild type. In general, these preliminary analyses highlight that the C-lobe of apo-LF can form a stable complex with both native and mutated RBDs of SARS-CoV-2, whereas the N-lobe of apo-LF is unable to form a stable complex with the latter, leaving the full-length structure of apo-LF that allegedly form a protein complex with the native RBD. This also reflects the importance of performing MD simulations after conducting molecular docking as simulation trajectories may reveal changes in inter-molecular interactions under a dynamic environment.
To complement the information obtained from the RMSD as well as to understand the statistical yet significant differences, the RMSF were measured to understand the specific per-residue changes of the proteins. It is evident that the results obtained for this analysis support the findings from the RMSD. The difference levels of fluctuation of each residue of the proteins in the unbound and bound forms result in dynamic changes of the tertiary structure of the proteins involved, thereby generating different RMSDs, as verified by ANOVA and post-hoc Tukey test. In addition, at residue level, there were strong fluctuations exhibited throughout the simulation for any RBDs that was bound to the N-lobe, indicating its volatility when binding to either of the RBD of the SARS-CoV-2. In contrast, the C-lobe has been demonstrated to form a stable complex by binding to the domain of both RBDs, as shown by the relatively low RMSF values. This is in contrast to the interaction of the full-length structure of apo-LF with the RBD where it can only effectively bind to the wild type and not the mutant, indicated by inspection of final snapshots of the molecular graphics, depicting each interaction coupled with the distance and contact analyses, highlighting the detachment of the multimeric complex. Despite the relatively low RMSF values for the full-length structure of apo-LF that was bound to either RBDs, the resulting simulations predict the incompatibility of the fulllength structure of apo-LF with the mutated RBD. Overall, based upon the above calculations, statistical analysis, and visualisation of the molecular structures, the findings demonstrate the formation of multimeric stable protein structures, comprising the C-lobe and the RBD of the wild type and the mutant as well as the full-length structure of apo-LF with the native RBD. On the other hand, a stark unbinding process was exhibited for the interaction involving the N-lobe and both RBDs. This behaviour was also seen for the interaction of the full-length structure of apo-LF with the mutated RBD. In general, these results confirm the instability of the N-lobe in binding to any RBDs as well as the full-length structure of apo-LF bound to the mutated RBD.
It is interesting to note that although the above qualitative analyses highlight the formation of stable multimeric structures, they do not necessarily indicate favourable binding. Specifically, MM-PBSA calculations ascertained that due to elevated polar solvation energies, a weak complexation process, involving the full-length structure of apo-LF and the native RBD was established. This is likely due to the presence of the highly positively charged region in the full-length structure of apo-LF, which offset the favourable complexation between the C-lobe inside the full-length structure and the RBD. This is in contrast to the formation of the other two multimeric protein structures as the results and statistical analysis corroborate the calculations obtained from the above calculations regarding the spontaneous formation of protein complexes. Specifically, it was found that the C-lobe of apo-LF could bind with either the wild type or the mutant RBD. This is primarily due to the presence of basic amino acid residues within the RBD that bind to the acidic amino acid residues of the C-lobe as well as the slight contribution of other non-polar amino acid residues that form van der Waals interactions. Calculations obtained from the number of hydrogen bonds are also congruent with the above free energy calculations. Due to the presence of higher number of hydrogen bonds, stronger binding of the C-lobe to the mutated RBD was exhibited compared to the binding of the C-lobe to the native RBD. Nevertheless, despite the weak complexation between the full-length structure of apo-LF and native RBD, there is a relatively stable number of hydrogen bonds that sustain the contact between those two components. Accordingly, this also explains the formation of the multimeric complex, as illustrated in Figure 4A. Sequence alignment also supports the above MM-PBSA calculations as the alignment demonstrated that a substantial number of residues in the C-lobe showed similarity with the aligned regions of the native and mutated RBDs. In addition, the results of this calculation are also consistent with a previous study, demonstrating favourable van der Waals interactions between LF and the S protein of SARS-CoV-2 (Campione et al., 2021). Overall, sequence alignment, number of hydrogen bond contacts, the distance and number of contacts analyses, as well as MM-PBSA calculations, verified by ANOVA and post-hoc Tukey test illustrate the C-lobe is the predominant domain within apo-LF that interacts efficiently with both RBDs of SARS-CoV-2.
Finally, the per-residue free energy decomposition analysis supports the previous analysis regarding the contribution of electrostatic interactions as the key driving force for the complexation process. As expected, negatively charged residues of the C-lobe bind with the positively charged residues of each of the RBDs. Conversely, basic amino acid residues of the former as well as acidic residues of the latter reduce the enthalpy of formation. This calculation is consistent with the previous RMSF results, where it was demonstrated that the fluctuation of His420, Arg654, and Arg689 is higher than that of the other residues. Accordingly, this also explains why the RBDs could not effectively bind to the N-lobe of apo-LF. Due to the presence of highly positively charged peptides in apo-LF, comprising the lactoferricin and lactoferampin regions (Bellamy et al., 1992;Darmawan et al., 2020;van der Kraan et al., 2004), the interaction between apo-LF or particularly that of the N-lobe and the RBD of both the native and mutated type is likely to be unfavourable, or at least, far weaker compared to that of the C-lobe. There is a limited number of negatively charged amino acid residues required in those regions to allow effective binding into the RBD of SARS-CoV-2. Recent literature also suggests that the C-lobe of apo-LF is more potent in binding to the S protein of SARS-CoV-2 because of less glycosylation than that of the Nlobe (Miotto et al., 2021). Thus, the results of this study are supported by previous work regarding the C-lobe of apo-LF being the more effective domain in apo-LF that binds to RBD of the S protein of SARS-CoV-2 than the N-lobe. In addition, our results also shed light on the differences in interactions between apo-Lf with both the original and the delta variant mutant RBD, and suggests that apo-Lf may be more effective at inhibiting the function of the delta variant RBD, while previous studies on apo-Lf interactions with SARS-CoV-2 had not yet elucidated variant-dependent differences. One possible reason behind this enhanced binding is the transformation of one residue from leucine to arginine at position 452, which results in an additional positively charged residue that can form favourable interactions with the negatively charged residues of the C-lobe of apo-LF. Hence, the mutations, occurring on SARS-CoV-2 do not necessarily obviate the efficacy of compounds or in this case the C-lobe of apo-LF in attaching to the RBD of SARS-CoV-2. Depending on the mutations, small incremental changes may exert beneficial effects to allow successful binding, thereby, resulting in further competition with the ACE-2 receptor in interacting with the RBD of the S of SARS-CoV-2.
Overall, this study has applied extensive computational techniques to elucidate the conformation and the interaction among the proteins at the molecular level, which is difficult to achieve by traditional empirical studies. Future studies examining the interaction of other potential bioactive ingredients with the RBD of SARS-CoV-2 are encouraged to evaluate the specific segment of RBD that has direct binding capabilities to ACE-2 (Williams-Noonan et al., 2021). Other in vitro or in vivo studies, which could corroborate the results of work, are also recommended to further attest the findings, published in this study.

Conclusions and summary
In conclusion, our results have demonstrated the potential use of apo-LF, particularly the C-lobe to bind via electrostatic attraction to the RBD of SARS-CoV-2, thereby limiting the latter from binding to the ACE-2 receptor. There are also other minor non-polar amino acid residues, including Tyr453, Tyr505, and Leu502 that assist the process of complexation. When comparing the structure between the two generated complexes, MM-PBSA and MD analyses highlight that the mutation from leucine to arginine at position 452 results in a more compact structure than the wild type, indicating a stronger binding, as verified in the more negative value of the binding free energy, stable number of hydrogen bonds, pairwise sequence alignment. This illustrates that the mutation of the RBD may be advantageous for certain target compounds to interact favourably. In summary, the key recommendation from the present work is that the C-lobe of apo-LF, which may be obtained by pepsin cleavage, could be effective against the native and especially the delta variant.

Disclosure statement
We hereby certify that, to the best of our knowledge, no person(s) has any potential interest in the publication of this article, with regards to either the practical or writing aspects that may corrupt its authenticity. If any information indicating otherwise comes to light, then the relevant people will be contacted and procedures followed as outlined by RMIT University.