Modeling structural interconversion in Alzheimers’ amyloid beta peptide with classical and intrinsically disordered protein force fields

Abstract A comprehensive understanding of the aggregation mechanism in amyloid beta 42 (Aβ42) peptide is imperative for developing therapeutic drugs to prevent or treat Alzheimer’s disease. Because of the high flexibility and lack of native tertiary structures of Aβ42, molecular dynamics (MD) simulations may help elucidate the peptide’s dynamics with atomic details and collectively improve ensembles not seen in experiments. We applied microsecond-timescale MD simulations to investigate the dynamics and conformational changes of Aβ42 by using a newly developed Amber force field (ff14IDPSFF). We compared the ff14IDPSFF and the regular ff14SB force field by examining the conformational changes of two distinct Aβ42 monomers in explicit solvent. Conformational ensembles obtained by simulations depend on the force field and initial structure, Aβ42α-helix or Aβ42β-strand. The ff14IDPSFF sampled a high ratio of disordered structures and diverse β-strand secondary structures; in contrast, ff14SB favored helicity during the Aβ42α-helix simulations. The conformations obtained from Aβ42β-strand simulations maintained a balanced content in the disordered and helical structures when simulated by ff14SB, but the conformers clearly favored disordered and β-sheet structures simulated by ff14IDPSFF. The results obtained with ff14IDPSFF qualitatively reproduced the NMR chemical shifts well. In-depth peptide and cluster analysis revealed some characteristic features that may be linked to early onset of the fibril-like structure. The C-terminal region (mainly M35-V40) featured in-registered anti-parallel β-strand (β-hairpin) conformations with tested systems. Our work should expand the knowledge of force field and structure dependency in MD simulations and reveals the underlying structural mechanism–function relationship in Aβ42 peptides. Communicated by Ramaswamy H. Sarma


Introduction
Intrinsically disordered proteins (IDPs) are associated with neurodegenerative disorders such as Parkinson's disease, Alzheimer's disease (AD), prion disease, Huntington's disease, and type II diabetes. (Caughey & Lansbury, 2003;Chiti & Dobson, 2017;Han & He, 2018;Panza et al., 2019) Amyloid beta peptides belong to the class of IDPs associated with AD. At the neurophysiological level, the pathology of AD is characterized by the presence of insoluble, densely packed, and aggregated peptide fibrils in the brain. Before forming the amyloid plaques, the soluble and structured oligomers of the amyloid beta (Ab) peptides undoubtedly have neurotoxic effects (Cline et al., 2018;Lee et al., 2017;Qiang et al., 2017). Therefore, investigating the dynamics of the peptide is crucial to understanding the molecular processes that lead to cell toxicity as well as the action of small molecule inhibitors for AD (Zhang et al., 2017).
As compared with its alloform Ab40, Ab42 (Figure 1(A)) is more neurotoxic and aggregates faster in amyloid plaques (Cohen et al., 2015;Finder et al., 2010;Meisl et al., 2014). The N-terminal region (residues D1 to K16) is highly flexible in the whole sequence and has metal-binding ability under abnormal physiological conditions (Adlard & Bush, 2006;Liao et al., 2017;Maynard et al., 2005;Roberts et al., 2012). The region mainly with non-polar residues spanning L17 to A21 is known as the central hydrophobic region (CHR). This region has a close relation with the aggregation event (Liu et al., 2004). Following the CHR is the central polar region (CPR), which contains many hydrophilic residues (E22 to G29). The final C-terminal hydrophobic region contains more non-polar residues (A30 to A42). X-ray crystallography, solidstate NMR spectroscopy, cryo-electron microscopy and other approaches Kheterpal et al., 2000;Kirschner et al., 1986;Luhrs et al., 2005;Meng et al., 2018;Olofsson et al., 2007;Petkova et al., 2002Petkova et al., , 2006Schmidt et al., 2015;Whittemore et al., 2005) have been advanced to determine static structures with high resolution, but exploration of Ab peptide dynamics and the structural mechanism-function association remains elusive. Molecular modeling methods are powerful tools for sampling more conformations and investigating the biophysical mechanisms of the Ab peptide.
Molecular dynamics (MD) simulations are emerging tools used to sample atomistic details of structural conformations and dynamics of molecules not achieved with experiments. Over the past decades, studies have used different force fields and MD algorithms to understand the dynamics and conformations of Ab42, ranging from monomeric to fibrillar states (Alves & Frigori, 2018;Barz et al., 2021;Bhattacharya et al., 2020;Caliskan et al., 2021;Chakraborty et al., 2020;Grasso et al., 2019;Hashemi et al., 2019;Ilie & Caflisch, 2018;Krupa et al., 2019;Lin et al., 2019;Masutani et al., 2019;Olubiyi & Strodel, 2012;Robustelli et al., 2018;Sgourakis et al., 2011;Thu et al., 2019;Xu et al., 2005;Zhang et al., 2016). However, conventional force fields with parameterization used in MD studies were optimized for structured proteins to agree with experimental data or often involved quantum calculations of small molecules or short peptide fragments. Studies have focused on evaluating the performance of different force fields in the hope of finding suitable force fields for Ab simulation (Caliskan et al., 2021;Carballo et al., 2017;Gerben et al., 2014;Man et al., 2017Man et al., , 2019Rahman et al., 2020;Rosenman et al., 2016;Siwy et al., 2017;Smith et al., 2015). For instance, Man et al. evaluated the performance of 17 widely used atomistic MM force fields on a seven-residue peptide from AMBER, GROMOS, OPLS and CHARMM families and suggested that five force fields were good candidates for amyloid peptide simulations: two are AMBER99SB-ILDN and AMBER14SB and three are CHARMM C22 Ã , C36 and C36m (Man et al., 2019). Carballo-Pacheco et al. applied various AMBER-based OPLS and CHARMM22 Ã force fields with replica exchange MD (REMD) to investigate the dynamics of Ab42: CHARMM22 Ã reproduced better NMR observables for Ca and HN chemical shifts than the other force fields (Carballo et al., 2017). Rosenman et al. analyzed the ensembles of two isoforms (Ab40 and Ab42) and mutant forms of Ab40 by using REMD with three different force fields to understand the sequence-dependent dynamics (Rosenman et al., 2016). However, concluding on a universal force field for simulating amyloid peptides (Ab, tau and a-synuclein, etc.) is challenging because their structural preferences depend on the force field, simulation protocol and conditions used (Ilie & Caflisch, 2019;Nasica-Labouze et al., 2015;Nguyen et al., 2021;Strodel, 2021). MD simulations of IDPs using conventional force fields may incorrectly characterize the features of IDPs, thereby affecting the quantitative analysis and hiding realistic ensemble descriptions. Recently, IDP-specific force fields such as ff99IDPs (Wang et al., 2014), ff14IDPs , ff14IDPSFF , A99SB-disp (Robustelli et al., 2018), OPLSIDPSFF (Yang et al., 2019), CHARMM36IDPSFF (Liu et al., 2018), and ff99SBnmr2 (Yu et al., 2020) were developed to refine and improve the modeling of IDPs. Because these updated IDP-specific force fields were parameterized based on short coils and turn peptides to improve the backbone dihedral angles or were optimized to achieve agreement with disordered peptides, conformational ensembles obtained with these force fields reproduced the quantitative characteristics of IDPs and intrinsically disordered regions. Specifically, ff14IDPSFF showed improved performance in sampling various IDPs, with high corresponding values to the experimental observables (Dan & Chen, 2019;Duong et al., 2018;Rahman et al., 2020;. Krupa et al. used conventional MD (cMD) and REMD to study the dynamics of Ab42 by comparing the performance of ff14IDSPFF and CHARMM36m with their original versions, ff99SB and ff14SB, and CHARMM, respectively, and they have shown that ff14IDPSFF yielded similar results in REMD sampling but improved conformational sampling in cMD . Recently, Rahman et al, conducted MD simulations using various IDP-specify force fields, namely, ff99IDPs, ff14IDPs and ff14IDPSFF, to study two different initial structures of Ab42: one a-helix rich content (33.33%) Figure 1. (A) Sequence of 42 residues amyloid beta peptide (Ab42), (Red: polar, negative charge residues; Blue: polar, positive charge residues; Black: non-polar residues; Green: polar, neutral residues). (B) NMR-resolved structure of Ab42 a-helix (the 4 th model, PDB ID: 1Z0Q). (C) NMR-resolved structure of Ab42 b-strand (strand A of the 5 th model, PDB ID: 2NAO); blue sphere, N-termini, red sphere, C-termini. conformer and one model conformer (a-helix content 26.19%). The authors concluded similar results for these force fields (Rahman et al., 2020). The question remains as to whether and how to distinguish the initial conformation for such a flexible peptide to see a significant difference on computational study, or is there an interconversion event occurring within the same Ab conformer but different results observed with different tested force fields?
This study compared the newly developed but not extensively explored ff14IDPSFF force field and the regular ff14SB force field by examining the conformational changes of two totally different Ab42 monomers in explicit solvent. The initial structures are shown in Figures 1(B) and (C). We sampled the conformer ensembles by using cMD and accelerated MD (aMD) simulations to attain microsecond-timescale (total 15.6 ls) MD and performed post-analysis to access the structural stability, flexibility and secondary structure properties. We evaluated the performance of the force fields by comparing the predicted structures with NMR chemical shift values and accessed the radius of gyration (Rg) values from the simulations for comparison with other studies. We also revealed some characteristic features in the sequence that were found in common from our study. We observed clear differences with the two tested force fields in the same initial structures, and similar conformational features within Ab42 were revealed by using distinct initial conformations. Our work broadens the spectrum of force-field and initial structure selection and provides useful insights into the conformational changes in Ab42.

MD simulations
Two NMR structures (PDB ID: 1Z0Q (Tomaselli et al., 2006), monomer and PDB ID: 2NAO (W€ alti et al., 2016), fibril) were used in the simulations. In 1Z0Q, with an a-helix rich conformer, 30 conformations were deposited, and the 4 th conformation with extended structure was selected as our initial structure Ab42 a-helix for cMD runs. For 2NAO, with b-strand rich fibril, we selected the strand A of the 5 th conformation as our initial structure Ab42 b-strand for cMD runs. In Ab42 a-helix (Figure 1(B)), this peptide is mixed, with mostly helices spanning S8 to V24 and 3-10 helix spanning I31 to V36, connected with the coil structure. In Ab42 b-strand (Figure 1(C)), the b-strand segment spans D1 to D7, Q15 to F19, and V40 to A42, connected with coils, a turn and the isolated bridge structures. Table 1 lists all working systems and conditions according to the force fields and simulation method used.
In aMD, the initial conformation was obtained from the last frame of the cMD. All simulations were performed with 3 random number seeds noted as (1, 2, 3) in the simulation index column. All MD simulations were performed with the Amber 18 simulations package (Case et al., 2018), Amber ff14SB and ff14IDPSFF  force fields were used for all working systems. To use ff14IDPSFF, the regular topology file with optimized grid-based energies (CMAP) for all 20 amino acids was converted to the new topology file (Wang et al., 2014). A standard Amber protocol was used. Na þ ions were added to neutralize the systems. The following minimization steps were performed for the hydrogen atom, side chains, and the entire protein complex. Then, the systems were solvated with a rectangular (orthogonal) TIP3P water box (Jorgensen et al., 1983) with the cutting edge of box 12 A˚away from the solutes. Minimization for the water molecules followed by the entire system was performed to correct any inconsistency. The water molecules with solutes fixed at 300 K were equilibrated in the NPT ensemble, followed by heating the systems with an increment of 25 K temperature up to the final 300 K. This step was used to fully relax the system before entering the production run. Long-range electrostatic interactions were computed by the particle mesh Ewald method (Essmann et al., 1995). The SHAKE algorithm was used to constrain the bonds involving hydrogen atoms (Ryckaert et al., 1977). The Langevin thermostat (Adelman, 1979;Doll & Dion, 1976) with a damping constant of 2 ps À1 was used to maintain a temperature of 300 K. The production run in NPT ensemble had a 2-fs time step. cMD with a 1-ms production run for each working system was simulated for 3 seeds. We collected the resulting trajectories every 1 ps and re-saved the frames for analysis at intervals of 100 ps.
To further enhance the samplings of the monomers, all working systems underwent aMD simulations (Hamelberg et al., 2004). Briefly, cMD simulations were run at 10 ns to obtain dihedral and total potential energies. To simulate aMD with different boost potentials, we calculated four parametersenergy threshold potential (EthreshP), boost energy potential (alphaP), energy threshold dihedral potential (EthreshD), and boost energy dihedral potential (alphaD)based on the established protocol (You & Chang, 2018), and different boost potentials (a ¼ 0.2, 0.5 and 1.0) were applied in the alphaD parameter to each working system. All aMD simulations were performed at 100 ns for 3 seeds (Table 1) under the same protocol and conditions run by cMD. We used Define Secondary Structure Protein (DSSP) and conformational clustering, for all aMD runs.

Post-analysis
We used visual MD (VMD) (Humphrey et al., 1996) for visualization. Cpptraj implemented in AmberTools18 (Case et al., 2018) was used to calculate the root mean square deviation  (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg), solvent accessible surface areas (SASA), and intra-peptide contact map. Chemical shifts were downloaded from the BioMasRes Database (BMRB) (Ulrich et al., 2008) (BMRB ID: 6554 for Ab42 a-helix and BMRB ID: 26692 for Ab42 b-strand ). The chemical shifts were estimated by using the SHIFTX2 program (Han et al., 2011) and results were compared with the NMR data. We used Multiscale Modeling Tools for Structural Biology (MMTSB) (Shao et al., 2007) with a K-means clustering program based on an RMSD cutoff of 4.5 Å to proceed with conformational clustering. The secondary structures for Ab42 were assigned based on the DSSP algorithm (Kabsch & Sander, 1983) implemented in VMD. The salt bridge distance measurement with a cutoff of 4.0 Å was calculated in VMD.

Results and discussion
Assessment of overall stability and flexibility of peptide Ab42 To examine the stability and flexibility of all working systems, we first calculated the Ca atomic RMSDs ( Figure S1). Figure   Figure 2. Continued. S1(B) illustrates a larger conformational adjustment with ff14IDPSFF simulations initiated from Ab42 a-helix as compared with ff14SB simulations ( Figure S1(A)). For Ab42 b-strand simulated by the two force fields, ff14SB (Figure S1(C)) conferred a smaller average conformational fluctuation (RMSD values from 10 to 14 Å) as compared with ff14IDPSFF ( Figure S1(D)) (RMSD values from 6 to 14 Å), which is similar to the range 5 to 15 Å observed by Krupa et al. (2019) who used cMD with the same IDP force field. Structured proteins typically fluctuate with less dynamics (RMSD < 3 Å), whereas in Ab42, the RMSD values ranged from 6 to 18 Å, which indicates that it unfolded and refolded during the simulations. To further monitor the convergence of all working systems, we calculated the cumulative clusters in a time-dependent window. For each 100-ns time window, the cluster of conformers was calculated ( Figure S2). For simulations of Ab42 a-helix , the overall number of clusters increased more with ff14IDPSFF than ff14SB. A similar trend occurred with simulations of Ab42 b-strand , with a higher number of clusters sampled with ff14IDPSFF than ff14SB. Notably, the cluster numbers of all working systems reached a plateau toward the end of the simulations, which indicates the convergence of the conformation sampling. However, existing free energy landscape studies suggested that the intrinsically disordered Ab42 has more distinct minima than does a typical globular protein (Zheng et al., 2016. Our MD runs did not likely sample all energy minima. The conformations may be kinetically trapped in an energy funnel (Jia et al., 2020), and enhanced sampling techniques, such as REMD simulation with a set of parameters (temperature, number of replicas, exchange rates, etc.) need to be applied to overcome the energy barriers. As a result, to sample more conformations, we also used aMD to obtain additional conformations.
To illustrate the overall flexibility of the evaluated Ab42 monomers, we computed the Ca RMSF ( Figures S3(A) and S3(B)). The central hydrophobic core region (gray bar) was more stable than the N-and C-termini. The first 10 residues (or the metal binding region) are sometimes missing in other structures, but unexpectedly, we did not observe that this region was more flexible than the C-terminal region. Fluctuation in simulations was high with ff14IDPSFF for both monomers as compared with ff14SB. This observation can explain the relatively high ratio of disordered conformations simulated with ff14IDPSFF.
Comparison of force field performance with experimental data    al., 2016) and the predicted structures is a robust and straightforward method to evaluate the performance of force fields. With the available chemical shift residue dataset, we compared the HN and Ca using simulations from Ab42 a-helix (Figure 2(A) and 2(B), respectively). The Ca, Cb and N from simulations of Ab42 b-strand were calculated and compared in Figure 2(C)-(E), respectively. The difference in mean RMSD values for the observables between the experimental data and the predicted structures are in Table 2. For both Ab42 ahelix and Ab42 b-strand monomers, the simulation results and experimental data showed the same trend with the two tested force fields (Nguyen et al., 2021;Tomaselli et al., 2006;W€ alti et al., 2016) with ff14IDPSFF providing better results for NMR observables except for Ca in Ab42 a-helix . Sampling was improved with the regular ff14SB for Ca in Ab42 a-helix . Similar results were observed in other computational simulations Rahman et al., 2020). For the Ca chemical shift in Ab42 a-helix , the calculated value deviated from the experimental data with ff14IDPSFF simulations (Figure 2(B)). The difference in RMSD was calculated as 0.928 ppm with ff14IDPSFF versus 0.194 ppm with ff14SB. Such a deviation in ff14IDPSFF corresponds to the parameter optimized by fitting with small peptides and coils, whereas for ff14SB, the favorable helicity is obvious when simulating a helical rich conformer by this force field. This was not the case for the Ca chemical shift in Ab42 b-strand (Figure 2(C)). We observed better results in agreement with the experimental data with ff14IDPSFF, with a difference in RMSD of 0.518 ppm versus 0.941 ppm with ff14SB. The difference in RMSD value for the Cb chemical shift indicated that the calculated values under ff14IDPSFF simulation are closer to the experimental data (0.364 ppm with ff14IDPSFF vs 0.505 ppm with ff14SB). Figure  2(E) shows that the predicted data for the N chemical shifts for both tested force fields are quite large in simulating Ab42 b-strand . The difference in RMSD was higher with ff14IDPSFF than ff14SB (0.666 vs 0.564 ppm).
To further elucidate the compactness and measure the sizes of our simulated Ab42 structures, we plotted the distribution with coordinates of Rg and RMSD (Figure 3) for the whole trajectories. For Ab42 a-helix simulations, the distribution of structural compactness with ff14IDPSFF (Figure 3(B)) ranged from 8.2 to 23.8 Å for Rg, whereas the distribution with ff14SB (Figure 3(A)) ranged from 8.2 to 22.1 Å for Rg. Also, another distribution of high probability represented a different conformation ranging from 18.0 to 20.0 Å for Rg. This observation indicates a larger populated structural distribution and diverse conformations with ff14IDPSFF. However, simulations starting from Ab42 b-strand showed wider dispersed regions, with mean Rg values ranging from 8.8 to 17.2 Å with ff14SB and 8.2 to 19.8 Å with ff14IDPSFF ( Figure  3(C) and (D), respectively). Ab42 a-helix simulated by ff14IDPSFF was able to create more disordered and small structural compactness and flexibility. The calculated Rg values for both the Ab42 a-helix and Ab42 b-strand are in Table S1, (Chakraborty et al., 2020) and $2.0 Å lower than the value reported by Meng et al. (15.9 ± 0.1 Å) (Meng et al., 2018), whose results were shown in comparison with fluorescence resonance energy transfer results. The estimated average Rg values for the four systems in our study (11.3 ± 2.0 Å, 13.4 ± 4.0 Å, 10.9 ± 1.0 Å, and 11.2 ± 1.4 Å) agree well with the values reported by Huy et al. (2016), Carballo et al. (2017, and Rahman et al. (2020).
To structurally characterize the conformational interconversion, we examined the secondary structure changes during our MD runs. The time-dependent contents of each secondary structure are in Figures 4 and 5, Figure S5 for cMD, and Figures S6 and S7 for aMD. The secondary structure content is reported in Figure S4.
Investigation of the transformation of the secondary structure in Ab42 to reveal the key features of the residues Ab42 a-helix : We analyzed changes in secondary structures during cMD with the two force fields. In general, ff14IDPSFF yielded more frequent interconversion between secondary structures than did ff14SB. For MD run A1 (1), on the C-terminal, the transition event was from a-helix to the coil and to the b-sheet within the residues M35, V36, V39, and V40, over $0 to 300 ns (Figure 4(A)). V36 transitioned from the turn (green) to the b-bridge (brown)/b-sheet (yellow), which lasted for $150 ns, and transformed back and forth from the turn to 3-10 helix (blue), lasting for $150 ns, before switching back to the b-sheet for another $200 ns. However, V39 and V40 transitioned from the turn/coil to the b-sheet over $0 to 300 ns and retained mostly b-sheets after $470 ns, lasting for $200 ns. G37 and G38 did not form a b-sheet but were more prone to alternate between the turn and 3-10 helix because of lack of side chains. Notably, G37 and G38 were in the coil conformation because they connected the hydrophobic residues in our simulation. Previous research indicated that G37A mutation can disrupt the b-sheet formation (Xu et al., 2005). These secondary structure transitions of the residues in the C-terminal and its longevity to be in that specific secondary structure may suggest that the region of the conformer is more susceptible to aggregation. In other words, the time it takes for the residue to morph from the a-helix to the b-sheet and how long the b-sheet conformation lasts within the MD run could imply how prone the residue is to aggregation.
In the MD run A1I (1) (Figure 4(B)), on the C-terminal, although M35-V40 is unique as compared with Figure 4(A), residues V36-G38 did not form the b-sheet conformation. Also, V39, V40 and I41 formed the b-sheet conformation later in the MD run, over $300 ns. The V39 and V40 longevity of the b-sheet in Figure 4(B) being similar to that in Figure 4(A) could imply that those residues are prone to aggregation.
To quantify the content of secondary structures that Ab42 produced in these simulations and gain more in-depth insight into the conformers, we calculated the percentage of helix (red, orange, gray), b-sheet (yellow)/b-bridge (green), turn (blue), and coil (dark blue) elements as a function of residues within the Ab42 sequence. In Figure S4(A) and S4(B), the MD run A1 (1) showed that helical elements were more prominent in simulations with ff14SB (red). Two segments with more than 50% helicity persisted in the region  E11 to V24 and extended from residues K28 to M35, with a small break in the G25, S26 and N27 residues having turn structures. This relatively high content of helical structure was shown in experimental studies (Kirkitadze et al., 2001;Vivekanandan et al., 2011) The prominent helical content was also observed at the same peptide regions in recent simulation studies (Caliskan et al., 2021;Carballo et al., 2017;Lincoff et al., 2019) The coil elements generally persisted toward the terminus of Ab42. The estimated percentage of helical/turn/coil content in this MD simulation was > 80%, which is similar to the result reported by Krupa et al. ($85%) using the same force field in cMD . Figure  S4(B) shows that MD run A1I (1) produced a short a-helix and 3-10 helix segments that were approximately 10% content in the region H13 to F20 (red and gray) but produced b-sheet elements toward the N-and C-termini. The overall appearance of the b-sheet was > 80% in the residues spanning I32 to M35, with a long strand appearing at regions Y10-H13 and A2-H6, possibly because of the bulky, alkyl side chain within these residues that therefore could correspond to the increased aggregation rate (Thu et al., 2019). The overall b-sheet content was estimated at $30% in the whole sequence, which was also observed in a recent study using the same force field  and in experiment findings (Kirkitadze et al., 2001;Ono et al., 2009). The central polar region appeared to be mostly coil and turn structures, with a few b-bridge structures at G25 and K28.
Ab42 b-strand : Figure 5(A) shows that MD B1(1) transitioned from turns to coils to b-strands. Taking a closer look at the M35-V40 region, M35 started as a turn conformation, then morphed into a coil within certain timeframes, then changed back to turns. That residue retained the coil secondary structure within the last $300 ns. V36 started as a turn conformation as well but quickly transformed into the coil secondary structure after $100 ns and mostly stayed as the coil structure for the rest of the MD run. G37 and G38 started as a  (1,2,3), respectively]. N-terminal region (N, residues 1 À 16), hydrophobic central region (CH, residues 17 À 21), central polar region (P, residues 22-31), and C-terminal region (C, residues 32 À 42).
turn secondary structure but morphed to a coil throughout the MD run. V39 and V40 were mostly a mixed form with b-bridges/b-sheets and coils throughout the MD run. These two residues transformed back and forth between b-sheets and b-bridges until the latter part of the MD run, when V40 transformed to coils for the last $300 ns. V39 and V40 being hydrophobic and more prone to forming and maintaining the b-sheets and b-bridge secondary structures may imply that these residues are prone to aggregation.
In Figure 5(B), M35-V40 showed a trend of transitioning from turns to b-sheets. M35 started as a turn conformation, then morphed into a b-sheet, which lasted for $100 ns. Then, M35 transformed into coils and turns for a short time until it transformed and sustained the b-sheet structure for $300 ns. Finally, it morphed back to turns and transitioned to b-sheets, which lasted about $150 ns. Likewise, V36 showed a pattern similar to that for M35 but with more small transitions to b-sheets at the beginning of the MD run. G37 and G38 started as a coil but transformed back and forth from turns to b-sheets during the MD run. V39 and V40 mainly started as b-bridges and coils, respectively, but had a similar pattern (Figure 5(A)). These residues sustained a $300-ns b-sheet structure (timeframe 450-730 ns), which suggests that the region was prone to aggregation. M35, in particular, was more prone to forming b-sheets later in the MD run, so it may be prone to aggregation.
Again, we further quantified the percentage of secondary structures by plotting the propensity of each residue. Residues V36 to I41 of the C-terminal showed less b-sheet content ( Figure S4(C)) and sparse content in the polar region spanning S25 to K28 in MD B1(1). The remaining residues in the sequence were dominated by helix, coil and turn structures. In Figure S4(D), MD run B1I (1) showed a high ratio of the b-sheet appearing mainly in the regions located at the N-and C-termini and central hydrophobic region. The high b-sheet content observed in our MD study was in good agreement with Rosenman et al. (2016) and Lincoff et al. (2019) but not with another study reporting helix and b-sheet contents of 11.93% and 10.6% with ff14IDPSFF, thus indicating a quantitative balance in helix and b-sheet content for their a-helix-rich conformer (Rahman et al., 2020). Previously, ff14SB was believed to overestimate helix content (Hornak et al., 2006;Krupa et al., 2019;Man et al., 2017), so this force field may simulate more helix conformers in Ab42 monomers, although we selected two distinct monomeric states. Force field favorability in secondary structure determination remains a substantial problem because the Ab42 simulation strongly depends on the force field selection and simulation protocol.

Mapping residue contacts to characterize the selfassembly regions
To characterize the intramolecular interactions within the monomers, we created intrapeptide contact maps to show the probabilities of the backbone contacts. These maps reveal the characteristic monomeric tertiary structures of Ab42 defined by four different regions: N-terminal region (N) (D1 À K16), hydrophobic central region (CH) (L17 À A21), central polar region (P) (E22-I31) and C-terminal region (C) (I32 À A42).
In our analysis, two residues were in contact on this map if their Ca À Ca distance was 7.5 A˚. Figure 6(A) and (B) compare the two force fields for Ab42 a-helix . With ff14IDPSFF simulation [ Figure 6(B), MD run A1I (1,2,3)], there was highprobability contact between residues A2-S8 and E11-Q15, residues G9-H14 and the CH region, the CH region and V31-M35, and the N and C regions. In contrast, with ff14SB simulation, there was sparse contact in these regions [ Figure 6(A), MD run A1 (1,2,3)], so the backbone contact may not be obvious in our simulation with the cutoff criteria. Also, the polar region P had the least contact with the two force fields, so this region could be very dynamic. Our findings agree with Rosenman et al. (2016) who calculated the tertiary structure with OPLS and AMBER99sb-ILDN for Ab42 and CHARMM22 Ã for Ab40. Figure 6(C) and (D) compare the two force fields for Ab42 b-strand . Similar to A1I, as shown in Figure  6(D), the MD run B1I (1,2,3) shows a high probability of residue contact between residues A2-S8 and E11-Q15, residues G9-H14 and the CH region, N and P regions, CH and P regions, CH and C regions, and residues E11-Q15 and the C region. The high contact ratio between residues could lead to the increased breadth of the b-sheets after in-register stacking within this monomer. With ff14SB simulation, in the b-strand shown in Figure 6(C), MD run B1 (1,2,3), we did not observe close contact between N and P regions; however, we observed contacts between N and CH regions, CH and P regions, and P and C regions. These regions may be prone to forming short b-hairpins and discrete segments of parallel b-strands rather than stacking in-register b-sheet conformations. These short b-hairpins and discrete segments of parallel b-strands were found linked to early-onset aggregation (Masuda et al., 2008;Wu et al., 2009;Yan & Wang, 2006).
Comparison of conformation clusters sampled by the two force fields in Ab42 to reveal the fingerprints of the fibril structure To compare the difference in conformer distribution with the two force fields and to glean further insights into the heterogeneity of Ab42 structures, we used conformer clustering. We clustered conformations sampled from cMD, and each analysis combined MD runs with 3 random number seeds to obtain a total of 30000 snapshots. The MD runs A1 (1, 2, 3) and A1I (1, 2, 3) generated 104 and 208 clusters, respectively. When MD runs were initiated with the Ab42 b-strand conformation, we obtained fewer clusters: 95 and 106 clusters from simulations B1 (1, 2, 3) and B1I (1, 2, 3), respectively ( Figure  S8). The population of monomeric states in Ab 1-42 highly depended on the force field used. Hence, ff14IDPSFF strongly favored the disordered states and therefore a high ratio of clusters. The inset in Figure S8 shows the population percentage of the top 20 clusters. Initiated by a b-strand conformation, ff14SB sampled a few conformations with a large population, whereas ff14IDPSFF produced more heterogenous conformers with a similar population. Similar results can be seen in MD runs A1I (1, 2, 3) and A1 (1, 2, 3). ff14SB sampled a few highly popular conformations as compared with the clusters generated by ff14IDPSFF, although the differences were not as pronounced as runs initiated by a b-strand conformation.
The difference in the monomeric states of Ab42 can be further visualized by checking the most popular conformations obtained with the two force fields. Figure 7(A) and (B) present the eight largest clusters: the top eight clusters occupied 38.2% and 33.3% of the total conformer population from A1 (1,2,3) and A1I (1,2,3), respectively. Starting with Ab42 a-helix , b-sheets and b-strands were generated with ff14IDPSFF [AII (1,2,3)], mainly located at the N-and C-termini as well as the CH region. The central polar region was dominated by coil and turn conformations. The preference for b-sheets and anti-paralleled b-hairpins is not uncommon: they were also observed and reported in studies using GROMOS ffG53a6 (Olubiyi & Strodel, 2012), AMBER99SB and OPLS (Carballo et al., 2017), and ff14IDPSFF force fields .
With ff14SB, Ab42 a-helix produced a large amount of helical structures, thus indicating little change throughout the simulations. Such rich helix conformation was also observed in the studies of Gerben et al., using AMBER 03 and CHARMM22 þ CMAP (Gerben et al., 2014), and Carballo-Pacheco et al., using the AMBER99SBILDN-NMR force field (Carballo et al., 2017). The exception is in the C-terminal, which showed a short b-hairpin and short 310 helix with about the same percentage, which became the dominant conformation when using GROMOS9653a6 (Gerben et al., 2014) and GROMOSffG43a2 (Olubiyi & Strodel, 2012). In the cluster conformation obtained in Rahman et al., using the same initial structure and force field, among the top three conformations reported with their six tested force fields, only ff03w conferred an anti-parallel b-hairpin conformation. The remaining conformations were disordered or had helix conformations (Rahman et al., 2020) which could be due to simulation protocol or analysis criteria differences.
For Ab42 b-strand , the top eight clusters occupied 60.9% of the total conformer population (Figure 7(C)) and 53.0% of the total conformer population (Figure 7(D)) for B1 (1,2,3) and B1I (1,2,3), respectively. With ff14IDPSFF, B1I (1,2,3), Ab42 b-strand maintained a large percentage of b-sheet and coil structures in the top eight clusters, so ff14IDPSFF produced a reasonably heterogeneous ensemble of structures dominated by b-hairpin and anti-parallel b-sheets. With ff14SB, MD run B1 (1,2,3), Ab42 b-strand maintained much higher percentages within each cluster, so the force field sampled a small number of structures. The distinct structures were attributed mostly to the formation of short helix, coil and turn conformations that encompass most of the Ab 1-42 sequence. Surprisingly, a high percentage of b-hairpins and b-sheets in anti-paralleled orientations were also observed in the C-terminus and extended to residue G25 (9.8% in cluster #2), which agrees well with the NMR study (Ball et al., 2013).
Our cluster conformation analysis revealed some fingerprints linked to the onset of early fibril-like structures, which were also seen in other studies (Barz et al., 2021;Chakraborty et al., 2020). This observation may suggest that self-assembly could be initiated in the C-terminal because of mostly non-polar residues.
Conformers can be trapped easily in a local energy minimum when using cMD. To overcome this issue, we extended the sampling by performing aMD simulations, with which we obtained some conformations that may not be seen in cMD. Adding aMD in simulation studies can reproduce structures and results in close agreement with experiments in other studies (Doshi & Hamelberg, 2015;Markwick et al., 2007;Rodriguez et al., 2016). We simulated 100 ns for each system but acknowledged that our systems did not reach convergence in such a short time because convergence depends on the property of interest. Among three different boost potentials, 0.2, 0.5 and 1.0, we further characterized the conformers of clustering by selecting the simulation performed under the 0.5 boost potential. The rationale is based on the trajectories of the timeline for the secondary structures ( Figures S4-S6). The 0.2 boost potential in aMD overestimated the structured conformers, such as the helix and sheet structures, whereas the 1.0 boost potential in aMD overestimated the disordered conformers, such as the turn and coil structures. By justifying the suitability of the 0.5 boost potential in the aMD for Ab42, we applied the K-means clustering method for the trajectories of all aMDs and obtained conformational diversity ( Figure S9). With ff14IDPSFF used for both MD runs A3I (1,2,3) and B3I (1,2,3), structurally, with a few exceptions, the clusters exhibited an extended length of b-strand contents in their top 5 clusters but low helical content for the a-helix conformer [A3 (1,2,3)] and a high degree of unstructured coils and short segments of helical content for Ab42 b-strand [B3 (1,2,3)]. As a result, the conformers obtained with the aMD simulations may be supplemental to the conformational ensembles, which could not be revealed by cMD simulation.
Investigation of solvent accessible surface area (SASA) and two key salt bridges, E22/K28 and D23/K28, to reveal solvent effect in structure stability and conformation We conducted SASA analysis to investigate the solvent effect in structure conformation. The distribution of SASA values was relatively broader and right-shifted for both monomers by using ff14IDPSFF (Figure 8). Particularly, the result for Ab42 a-helix simulations had another significant peak near 41 nm 2 , which indicates a fraction of extended conformations exposed to the solvent. This finding might suggest that the ff14IDPSFF force field was able to simulate some particular unfolded states, which may not be feasible with ff14SB. The estimated SASA values of 32.8, 33, 32.2 and 33.3 nm 2 for all four working systems were in good agreement with other studies (Huy et al., 2016;Krupa et al., 2019); a value close to 41 nm 2 was reported by Duan et al. (2015), using AMBERff03, and Watts et al. (2017), using the Ab40 peptide with CHARMM36. Finally, we measured the salt bridge distance formed by two pairs of residues, E22/ K28 and D23/K28, which were established by NMR studies. The salt bridge formed between these residues stabilized the Ab42 peptide, which then in the later stage, formed Ab fibrils (Reddy et al., 2009;Sciarretta et al., 2005). We set a cutoff of 4.0 Å between the positive charge atom N involving the sidechain K28 and the negative charge atom O involving the sidechain E22 or D23. We obtained a total of 3000-ns time lengths by combining three MD runs for each system. To find the frequency of the salt bridge formation throughout all MD simulations, we calculated the average estimated time for salt bridge formation over the total run time of the MD simulation, as exemplified in equation 1 below: P ðestimated time length of salt bridge formationÞ ðtotal MD simulation runtimeÞ Ã100% (1) Figures S10(A) and (B) show the distance of salt bridges E22/K28 and D23/K28, respectively, over 3000 ns. Using equation 1 with ff14SB, the estimated total time of salt bridge formation was $1100 ns over the total run time (3000 ns), for a $37% frequency of E22/K28 salt bridge formation. In contrast, with ff14IDPSFF, the frequency of E22/K28 salt bridge formation was $9% of run time (orange line [A1I (1,2,3)]). D23/ K28 ( Figure S10(B)) formed a salt bridge infrequently, at $1.3% (blue) with ff14SB and $15% (orange) with ff14IDPSFF. Starting from Ab42 b-strand , the E22/K28 salt bridge was unfavorable with the two force fields (yellow and gray lines), but the frequency of D23/K28 with ff14IDPSFF was increased, $10% [B1I (1,2,3)]. Thus, salt bridge interactions between E22/K28 and D23/K28 should exist alternately to dominate the stabilization of the central polar region, thus resulting in large conformational changes and accessing the solvent dynamically.

Conclusions
Because of their highly flexible properties, use of IDP-specific force fields is essential to accurately model conformations and transitions of IDPs (Nguyen et al., , 2021Strodel, 2021). Hence, studies have examined the conformations and protein dynamics sampled by using experiments and molecular simulations of amyloid peptide (Bhattacharya et al., 2019;Carballo et al., 2017;Krupa et al., 2019;Man et al., 2017;Rahman et al., 2020;Robustelli et al., 2018). Here we used a recently developed but not extensively examined IDP-specific force field, ff14IDPSFF, to sample conformations of Ab42 monomers and compared the conformers obtained by a classical force field, ff14SB, by both cMD and aMD. Our study clearly showed that the heterogeneity, dynamics and conformations of Ab42 heavily depended on the initial structure and force field used. Of note, system convergence depends on the property of interest, and our purpose was to ensure a fair comparison in the simulation protocols for two force fields with two starting structures. We calculated the predicted structural chemical shifts and compared the CS observables with the NMR experiments. The force field ff14IDPSFF performed qualitatively well in all tested CSs except for Ca in Ab42 a-helix , whereas ff14SB reproduced the Ca CS observable qualitatively well in Ab42 a-helix . We analyzed the probability distribution of the Rg, secondary structural contents, and cluster conformers. Our Rg and secondary structural content values modeled with both ff14IDPSFF and ff14SB force fields are in qualitative agreement with findings from other studies (Chakraborty et al., 2020) One key finding noted in our work is that starting with Ab42 a-helix , the ff14IDPSFF force field sampled a high ratio of disordered structures and diverse b-sheet secondary structures, whereas ff14SB simulated high helical content in the MD simulations. This cooccurrence was also observed in other similar studies Rahman et al., 2020). The conformation obtained from the Ab42 b-strand MD simulations maintained a balanced content in the disordered and helical structures with ff14SB, but the conformers clearly favored disordered and b-sheet structures with ff14IDPSFF. Detailed analysis in the contact map and cluster conformation showed that the C-terminal region (mainly M35-V40) contained in-registered anti-parallel b-sheet (b-hairpin) conformations in both monomers with both force fields. The force field ff14IDPSFF sampled more diverse conformations, which could be associated with self-assembly of onset of early fibril-like structures. SASA and salt bridge analysis (E22/K28 and D23/K28) revealed the simulated unfolded structure, which fell into the central polar region in the peptide.
This study compared two distinct initial structures simulated by two force fields, and the IDP force field ff14IDPSFF is suitable to simulate flexible Ab42 monomers. However, further tests for force field improvement are still needed to precisely describe the IDP aggregation, for instance, Ab42 dimerization and polymerization. Considering other factors such as solvent models and simulation protocols (simulation time lengths, temperature condition, etc.) is also required to improve the accuracy of MD simulation for better modeling the Ab42 peptide and other IDPs.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This study was supported by the US National Institutes of Health (GM-109045).