Dynamic and structural properties of porcine serum albumins

ABSTRACT Porcine serum albumin (PSA) is a promising biomarker for pork detection. Pork contamination is a serious concern for the global halal food industry since many manufacturers mix pork into halal beef products to reduce production costs. Many studies have thus been devoted to designing effective PSA-detecting biosensors. PSA is closely related to bovine serum albumin (BSA); therefore, molecular insight into PSA characteristics is crucial for identifying PSA. To understand PSA properties, molecular dynamics (MD) simulations were employed. The three-dimensional structures of PSA were obtained from homology modelling and Alphafold. Both models give similar results. PSA seems to have high hydrophobicity and unique electrostatic properties. Unlike BSA, PSA has no large electropositive patch on the rear of domain III. This property can be used to differentiate PSA from BSA. In the case of drug sites, PSA provides drug sites comparable in size to those of canine serum albumin (CSA), which are larger than those of bovine, human and feline albumins. Such larger binding pockets may imply the ability of PSA to accommodate a broader spectrum of ligands. The findings here, especially the difference between BSA and PSA, can serve as a basis for designing effective biosensors to detect PSA contaminants.


Introduction
Porcine serum albumin (PSA) is the most abundant protein in blood plasma. Like other albumins, PSA can bind and transport a broad spectrum of nutrients and ions [1]. PSA is one of the potential molecular biomarkers used for detecting pork contamination [2][3][4][5][6]. Pork contamination is a serious issue for the global halal food industry since many manufacturers mix pork into halal beef products to reduce production costs. Beef products contaminated with pork or pork derivatives are forbidden for Muslim communities; therefore, many attempts have been made to design effective and accurate methods for pork contamination detection [2][3][4][5][7,8]. Many techniques, such as enzyme-linked immunosorbent assay (ELISA), label-free immunosensors [8,9], aptasensors [10] and molecularly imprinted polymer (MIP)-based sensors [3,4,11], have been employed to detect PSA residues. Such methods require precise PSA recognition by receptors, so selectivity for PSA is vital for pork contamination sensors. Understanding the structural and dynamic properties of the PSA biomarker in molecular detail is therefore crucial for designing highly selective PSA receptors in biosensors. Thus, in this work, the structural and dynamic characteristics of PSA are investigated. The similarities and dissimilarities between PSA and bovine serum albumin (BSA), which is found in beef, are extracted. This information will be beneficial for designing selective and specific PSA detection strategies in beef products.
PSA contains 583 amino acids. Like other mammalian albumins, PSA has an overall heart-like shape consisting of α-helices without β-sheets [12,13]. PSA has three domains (domains I, II and III), each of which contains two subdomains, denominated subdomains IA, IB, IIA, IIB, IIIA and IIIB (Figure 1(A)). PSA contains two drug sites (drug sites I and II, also known as Sudlow sites I and II) (Figure 1(A)), which can also be seen in human serum albumin. Drug site I (warfarin-azapropazone binding site) is located in subdomain IIA and drug site II (indole-benzodiazepine binding site) is in subdomain IIIA [14,15] (Figure 1(A)). Recently, subdomain IB has been proposed as a third drug site, which shows multiple drug recognition abilities [16]. All albumins contain 17 conserved disulphide bridges that stabilise the protein structure [17]. In most albumins, one cysteine at position 34 (C34) lies outside the disulphide network. This free C34 was reported to bind metal ions [18] and act as a covalently-bound drug carrier [19][20][21]. In general, PSA shares 75.00% and 79.42% sequence identities with human (HSA) and bovine (BSA) serum albumins, respectively (the sequence alignment can be seen in Figure 1(C)). PSA shows the highest sequence similarity to BSA, where domain III appears to be highly conserved (Table 1). The structure superimposition of PSA with BSA in Figure 1(B) displays a high similarity of native folds between both structures. A major difference seems to be the orientation of domains I and III (Figure 1(B)). Their dynamics are discussed later.
To date, no three-dimensional structure of PSA is available. Thus, in this work, structural models of PSA were obtained from homology modelling (MODELLER [22]) and AI-assisted modelling (Alphafold database [23]). Molecular dynamics (MD) simulations were performed to explore the structural and dynamic properties of both models. MD simulations have been widely used to study biological systems [24,25], including albumins [26][27][28][29][30]. The similarity in character, structure and function between bovine (BSA) and porcine (PSA) serum albumins is investigated here. This work also provides critical information on the similarities and dissimilarities between PSA and bovine (BSA), human (HSA), feline (FSA) and canine (CSA) serum albumins (data for the other albumins (number of hydrogen bonds, solvent-accessible area and electrostatic potentials) were obtained from previous studies [26,27]). The insight obtained here will be useful for designing specific and selective receptors for sensitive PSA biosensors. Furthermore, PSA was also reported to be a major allergen in pork products [31,32]. In addition, previous work reported the need for PSA depletion in serum biomarker discovery for genetic evaluation and improvement [33]. Thus, the structural and dynamic characteristics of PSA are crucial for designing novel and precise strategies for effective PSA detection and extraction.

Materials and methods
The three-dimensional structures of porcine serum albumin (PSA) were obtained from the Alphafold Protein Structure Database [23] (UniProt accession number: P08835) and MODELLER [22]. For the PSA model from MODELLER, a crystal structure of bovine serum albumin (BSA) (PDB code: 3V03) was used as a template because it has the highest sequence identity (79.42%) to PSA. 'AF' and 'MOD' are used to denote the PSA models from the Alphafold database [23] and MODELLER [22], respectively. The protonation states of charged amino acids were set at physiological pH. Each structure was placed in a cubic simulation box (dimensions of 11 × 11 × 11 nm³) and solvated with water (∼39,297 molecules) and counter ions (14 Na⁺ ions). All simulations were performed using the GROMACS 2020 package (www.gromacs.org) [34] with the AMBERFF99SB-ILDN force field [35]. To relax steric conflicts generated during setup, an energy minimisation of 1000 steps of steepest descent was performed. Long-range electrostatic interactions were treated using the Particle Mesh Ewald (PME) method [36] with a short-range cut-off of 1 nm, a Fourier spacing of 0.12 nm and fourth-order spline interpolation. All simulations were performed in the constant number of particles, pressure and temperature (NPT) ensemble. The temperatures of PSA and of the solvent with ions were coupled separately using the v-rescale thermostat [37] at 300 K with a coupling constant τ_t = 0.1 ps. The pressure was coupled using the Parrinello-Rahman algorithm at 1 bar with a coupling constant τ_p = 1 ps. The time step for integration was 2 fs. Coordinates were saved every 2 ps for subsequent analysis. A 10-ns equilibration run was performed, followed by a 500-ns production run. Each system was simulated twice with different random seeds (the suffixes '1' and '2' are used to represent simulation 1 and simulation 2, respectively). The simulations of PSA were analysed in comparison to those of bovine (BSA) and human (HSA) serum albumins from our previous work [27,30].
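The run settings listed above can be collected into a GROMACS-style parameter set. The following is a minimal sketch, not the input file actually used in this work: the option names follow the GROMACS mdp conventions, the coupling-group names ('Protein Non-Protein') are an assumption, and settings that the text does not state (for example the van der Waals cut-off and the compressibility that a real pressure-coupled run needs) are deliberately left out.

```python
# Sketch: production-run settings quoted in the Methods, written as mdp-style options.
# Anything not stated in the text is omitted or flagged as an assumption.

TIME_STEP_PS = 0.002      # 2 fs integration time step
PRODUCTION_NS = 500       # length of each production run
SAVE_EVERY_PS = 2         # coordinate output interval


def n_steps(length_ns, dt_ps):
    """Number of integration steps needed to cover length_ns."""
    return round(length_ns * 1000 / dt_ps)


mdp = {
    "integrator": "md",
    "dt": TIME_STEP_PS,
    "nsteps": n_steps(PRODUCTION_NS, TIME_STEP_PS),
    "nstxout-compressed": round(SAVE_EVERY_PS / TIME_STEP_PS),
    "coulombtype": "PME",
    "rcoulomb": 1.0,              # nm, short-range cut-off
    "fourierspacing": 0.12,       # nm
    "pme-order": 4,               # fourth-order spline interpolation
    "tcoupl": "V-rescale",
    "tc-grps": "Protein Non-Protein",   # assumption: group names are not given in the text
    "tau-t": "0.1 0.1",           # ps, one value per coupling group
    "ref-t": "300 300",           # K
    "pcoupl": "Parrinello-Rahman",
    "tau-p": 1.0,                 # ps
    "ref-p": 1.0,                 # bar
}


def to_mdp(params):
    """Render the dictionary as 'key = value' lines."""
    return "\n".join(f"{key:<20} = {value}" for key, value in params.items())


# Number of saved frames per production run
n_frames = mdp["nsteps"] // mdp["nstxout-compressed"]

print(to_mdp(mdp))
print("frames per run:", n_frames)
```

With a 2 fs time step, a 500-ns run corresponds to 250,000,000 integration steps, and saving every 2 ps gives 250,000 frames per run.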
All results provided here are the average values from the two simulations. The data were analysed using GROMACS and locally written codes. Visual Molecular Dynamics (VMD) was used for graphic visualisation [38]. C-alpha RMSDs and RMSFs were calculated using the initial structure of each production run as a reference. Principal Component Analysis (PCA) was performed using the 'gmx covar' and 'gmx anaeig' tools in GROMACS with default options. Only the first and second eigenvectors were used to analyse the major protein motion in all cases. The phylogenetic tree was built using Simple Phylogeny (https://www.ebi.ac.uk/Tools/phylogeny/simple_phylogeny/), where the multiple sequence alignment used was obtained using ClustalW [39].
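For readers unfamiliar with these quantities, the C-alpha RMSD and RMSF described above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the GROMACS implementation: it assumes the C-alpha coordinates have already been extracted into an array of shape (frames, atoms, 3) in nm, uses an unweighted least-squares (Kabsch) fit, and all function names are hypothetical.

```python
import numpy as np


def kabsch_fit(mobile, reference):
    """Superimpose mobile (N x 3) onto reference (N x 3) by a least-squares fit."""
    mob_c = mobile - mobile.mean(axis=0)
    ref_c = reference - reference.mean(axis=0)
    # SVD of the covariance between the two centred coordinate sets
    u, _, vt = np.linalg.svd(mob_c.T @ ref_c)
    # Correct for a possible reflection so that the result is a proper rotation
    d = np.sign(np.linalg.det(u @ vt))
    rotation = u @ np.diag([1.0, 1.0, d]) @ vt
    return mob_c @ rotation + reference.mean(axis=0)


def rmsd_series(traj, reference):
    """RMSD of every frame to the reference after superposition."""
    values = []
    for frame in traj:
        fitted = kabsch_fit(frame, reference)
        values.append(np.sqrt(np.mean(np.sum((fitted - reference) ** 2, axis=1))))
    return np.array(values)


def rmsf(traj, reference):
    """Per-atom fluctuation around the mean position of the fitted frames."""
    fitted = np.array([kabsch_fit(frame, reference) for frame in traj])
    mean_pos = fitted.mean(axis=0)
    return np.sqrt(np.mean(np.sum((fitted - mean_pos) ** 2, axis=2), axis=0))


# Toy check: a rotated and shifted copy of the reference has (near-)zero RMSD
rng = np.random.default_rng(0)
ref = rng.normal(size=(50, 3))
theta = 0.7
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta), np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
toy_traj = np.stack([ref @ rot_z.T + 1.5, ref])
toy_rmsd = rmsd_series(toy_traj, ref)
toy_rmsf = rmsf(toy_traj, ref)
```

Here the reference would be the first frame of each production run, as stated above; the toy arrays merely confirm that rigid-body motion is removed by the fit.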

Results and discussion
As reported earlier, PSA shares sequence identities of 75.00% with HSA and 79.42% with BSA (Table 1). PSA displays more sequence similarity to BSA, especially in domain III (81% identity), whereas domain II of PSA is likely to resemble those of HSA and BSA (Table 1). A phylogenetic tree also displays that BSA and CSA are in the same clades, indicating a close relationship (Figure S1 in supplementary information). The high sequence identity between BSA and PSA observed here also indicates the difficulty in discriminating between these two. Comparing the BSA and PSA folds, only domains III in both albumins are found to be oriented differently (Figure 1(B)). In terms of structural flexibility, both MOD and AF models of PSA show a comparable degree of overall structural fluctuation (Figure 2(A)). RMSFs indicate that domains I and III are more mobile (Figure 2(B)). No subdomain shows a significant drift in RMSD, although the AF model is found to show slightly lower RMSDs in subdomains IIIA and IIIB (Figure 2(C)). However, both models show a similar trend. AF and MOD models also provide the same main secondary structure, while small differences in turns and coils can be captured (Figure S2 in supplementary information).
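The sequence identities quoted above are derived from an alignment. As a minimal sketch of how such a percentage can be computed, the function below counts identical residues over the aligned, non-gap columns of two pre-aligned sequences. The choice of denominator is an assumption (alignment tools differ on this point), the function name is hypothetical, and the toy strings are invented for illustration and are not albumin sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences carry a residue
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy alignment (made-up strings): 7 gap-free columns, 6 of them identical
toy_identity = percent_identity("ACDE-FGH", "ACDQKFGH")
print(round(toy_identity, 2))
```

Applied per domain, the same calculation would give domain-wise identities of the kind listed in Table 1, provided the domain boundaries are taken from the alignment.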
To extract the dominant dynamics of PSA, Principal Component Analysis (PCA) was computed on C-alpha atoms (Figure 3(A)). The protein dynamics is described by the first eigenvector (principal component 1 (PC1)), which accounts for the major motions (Figure S3 in supplementary information). As in other albumins [26,27], PCA clearly demonstrates that the motion of domains I and III is responsible for the main PSA motion (Figure 3(A)). A scissor-like motion of PSA is captured, in which the dynamics of domains I and III can vary. This scissoring character was also observed in other albumins (human serum albumin (HSA), bovine serum albumin (BSA), canine serum albumin (CSA) and feline serum albumin (FSA)) in previous work [26,27,30]. Such motion may serve as a unique characteristic of each albumin. Moreover, self-hydrogen bonds within domains I and III were also computed (Figure 3(B)). Domains I and III of PSA provide a degree of self-interaction similar to that of HSA and BSA (Figure 3(B)).
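The idea behind this analysis can be illustrated with a short NumPy sketch: the covariance matrix of the fitted C-alpha coordinates is diagonalised, and the eigenvector with the largest eigenvalue (PC1) describes the dominant collective motion. This is a sketch under stated assumptions rather than a reproduction of 'gmx covar': the trajectory is assumed to be already superimposed and stored as an array of shape (frames, atoms, 3), no mass weighting is applied, the normalisation by the number of frames is an assumption, and the function name is hypothetical.

```python
import numpy as np


def pca_from_trajectory(traj):
    """Return eigenvalues (descending), eigenvectors (columns) and frame projections."""
    n_frames = traj.shape[0]
    x = traj.reshape(n_frames, -1)           # flatten to (frames, 3N)
    x_centered = x - x.mean(axis=0)
    cov = x_centered.T @ x_centered / n_frames
    evals, evecs = np.linalg.eigh(cov)       # symmetric matrix, ascending eigenvalues
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    projections = x_centered @ evecs         # coordinates of each frame along each PC
    return evals, evecs, projections


# Toy trajectory: one dominant collective mode plus small noise
rng = np.random.default_rng(1)
n_atoms, n_frames = 10, 400
reference = rng.normal(size=3 * n_atoms)
mode = rng.normal(size=3 * n_atoms)
mode /= np.linalg.norm(mode)
amplitude = 2.0 * np.sin(np.linspace(0.0, 8.0 * np.pi, n_frames))
noise = 0.01 * rng.normal(size=(n_frames, 3 * n_atoms))
toy_traj = (reference + amplitude[:, None] * mode + noise).reshape(n_frames, n_atoms, 3)

evals, evecs, projections = pca_from_trajectory(toy_traj)
pc1_fraction = evals[0] / evals.sum()        # share of total variance captured by PC1
pc1_overlap = abs(evecs[:, 0] @ mode)        # agreement with the imposed mode
```

In the toy case PC1 recovers the imposed mode and captures nearly all of the variance; for a protein, plotting the atomic displacements along PC1 is what reveals motions such as the scissoring of domains I and III.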
Furthermore, other structural properties are also investigated in Figure 4. PSA is packed by ∼475 self-hydrogen bonds and forms ∼1,250 hydrogen bonds with water molecules (Figure 4(A,B)). PSA produces the same number of self-interactions as other albumins (BSA, HSA, CSA and FSA) (Figure 4(B)). This finding is not unexpected since all vertebrate serum albumins share similar 3D structures and function similarly based on their evolution [41]. In general, both AF and MOD models provide similar structural and dynamic properties; however, some minor differences can be captured. Seemingly, such deviation is rooted in the slightly different orientation of domain III in the two models (domain III shows the highest RMSD (0.13 nm)) (Figure 4(C)).
In addition, solvent-accessible surface areas (SASA) were also calculated (Figure 4(D,E)). PSA appears to be less exposed to water (SASA of ∼300 nm²) than other albumins (SASA of ∼330-340 nm²) (Figure 4(D)). This is because the PSA surface shows a slightly more hydrophobic environment (Figure S4 in supplementary information). In addition, like BSA, PSA has two tryptophans (W135 and W214, located in subdomains IB and IIA, respectively). This can help PSA to accommodate aromatic ligands and induce spectroscopic quenching characteristics similar to those of BSA. In the case of drug sites, drug site II in HSA was reported to be smaller than drug site I [42]. Like HSA, PSA also shows a smaller drug site II (Figure 4(E)). The surface area of drug site I in PSA seems to lie between those of HSA/BSA/FSA/CSA and those of ovine (OSA) (19.06 nm²) and caprine (CASA) (21.34 nm²) serum albumins (SASA data of OSA and CASA were obtained from their X-ray structures (PDB codes: 4LUF (OSA) and 5ORI (CASA)) [43]), while PSA and CSA share a similar size of drug site II (Figure 4(E)). This can imply a similar ligand-binding ability at this site. This similar pocket volume can explain why PSA and CSA showed a comparable degree of ligand binding in previous work [44]. It was reported that the drug-binding affinity of both drug sites depends on ligand-protein interactions and degrees of wettability [45]. Such large pocket cavities can thus impact the binding affinity of drugs or metabolites at each site. Drug site I binds warfarin (WAR) and phenylbutazone (PBZ), while drug site II is the indole-benzodiazepine binding site [14,15]. Thus, the large water-filled drug site I of PSA may reduce the binding affinity of WAR and PBZ. Not only drugs but also other metabolites such as hormones, fatty acids and steroids can be carried by albumins; thus, it would also be interesting to study whether PSA dynamics cause any further difference in the transport of other metabolites.
Moreover, the electrostatic surface of PSA was also computed (Figure 4(F)). The front and side of domain I show clear electronegative environments, as in HSA, CSA, BSA and FSA [26,27]; however, PSA still has its own specific electrostatic properties. Unlike in HSA, BSA, FSA and CSA, the large electropositive patch on the back of domain III is not seen in PSA (Figure 4(F)). This can induce differences in ligand recognition. Seemingly, our finding agrees with previous work showing that the electrostatic surface acts as one of the unique features of each albumin [27,46]. Notably, previous studies defined domain III as the binding site for an albumin-selective aptamer [47,48], neonatal Fc receptor (FcRn) [49][50][51] and growth hormones [50]. The loss of a large electropositive surface on the rear of domain III can alter the binding strength and recognition of such ligands in PSA. For example, FcRn and Immunoglobulin G (IgG) fragments were reported to bind the frontal region of domain III on HSA using electrostatic and hydrophobic interactions [50][52][53][54]. The absence of electropositivity at the back of domain III on PSA may disrupt the recognition of the FcRn/IgG fragment. A further experimental study is needed.
In the case of redox activity, the free cysteine at position 34 (C34) was reported to be redox-active [55]. C34 was found to be involved in redox homeostasis in the circulation, anti-oxidative activity and oxidative stress [56], and brain rejuvenation [57]. The reactivity of the thiol group on C34 was also reported to be vital for albumin affinity toward drugs and fatty acids [58]. This reactivity varies among albumins depending on the wettability of the thiol group. C34 has become of interest due to its ability to serve as a site for site-selective covalent drug conjugation [49,59]. In humans, the C34-Y84 interaction is important for modulating C34 reactivity and protecting C34 from reactions [60]. For HSA and BSA, the sidechain of C34 is trapped by the interaction with Y84, which protects C34 from a redox reaction [27]. In contrast, C34 in CSA and FSA was more reactive because no interaction with Y84 is observed [26,27]. As in CSA and FSA, although the C34-Y84 distance in PSA is constant (∼0.5 nm), no bond between the two residues is identified (Figure 5(A)). Y84 seems to bend away from C34 and forms hydrogen bonds with Q33 instead (Figure 5(B,C)). Thus, the side chain of C34 can move freely in space without any protein contact. This high flexibility implies a more reactive C34 in PSA. However, a further experimental study is required.
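The distance argument above can be made concrete with a small sketch that computes a residue-residue distance over time and the fraction of frames falling within a contact cut-off. It is illustrative only: the positions are assumed to be pre-extracted atom coordinates in nm from a trajectory in which the protein has been made whole (no periodic-image correction is applied), the 0.35 nm cut-off is an assumed donor-acceptor distance criterion rather than a value stated in this work, and the toy data are synthetic.

```python
import numpy as np


def distance_and_contact_fraction(pos_a, pos_b, cutoff_nm=0.35):
    """Distance per frame between two atoms and the fraction of frames within the cut-off.

    pos_a, pos_b: arrays of shape (frames, 3) in nm.
    cutoff_nm: assumed contact criterion (not taken from the text).
    """
    distances = np.linalg.norm(pos_a - pos_b, axis=1)
    fraction = float(np.mean(distances <= cutoff_nm))
    return distances, fraction


# Synthetic example: two atoms that stay about 0.5 nm apart throughout
n_frames = 100
t = np.linspace(0.0, 2.0 * np.pi, n_frames)
atom_a = np.zeros((n_frames, 3))
atom_b = np.column_stack([0.5 + 0.02 * np.sin(t), np.zeros(n_frames), np.zeros(n_frames)])

toy_distances, toy_fraction = distance_and_contact_fraction(atom_a, atom_b)
```

A distance that stays near 0.5 nm never satisfies such a cut-off, which is consistent with a stable separation but no direct C34-Y84 bond; a full hydrogen-bond analysis would also apply an angle criterion.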

Conclusions
In this work, the dynamic and structural properties of PSA were studied. A comparison of the PSA structure with those of other albumins (HSA, BSA, CSA and FSA) is also reported here. The PSA models from both homology modelling and AI-assisted Alphafold seem to produce similar structural and dynamic properties. Overall, PSA displays high sequence similarity to BSA, especially in domain III. Such resemblance makes it difficult to detect PSA in PSA-contaminated beef products. Although all of the albumins compared (HSA, BSA, CSA, FSA and PSA) show similar scissoring-like motions of domains I and III, each still provides a slightly different movement of domain III, whose motion seems to be the signature of each albumin. Compared with other albumins (HSA/BSA/CSA/FSA), PSA displays a higher hydrophobicity due to its lower water exposure. In addition, PSA has unique electrostatic properties. Unlike in other albumins, especially BSA, the large electropositive patch on the rear surface of domain III is absent in PSA. This can interfere with the binding affinity and recognition of domain III-specific ligands. However, this property can be used as a key to differentiating PSA from BSA. In the case of drug sites, PSA provides drug sites comparable in size to those of CSA, which are larger than those of BSA/HSA/FSA. Such larger binding pockets may imply the ability of PSA to accommodate a broader spectrum of ligands than HSA/BSA/FSA.
Although PSA shows high sequence identity to BSA, their structural and dynamic properties are non-identical. Since both PSA and BSA have two tryptophans (W135 and W214), PSA should give similar fluorescence properties to BSA. Nonetheless, PSA contains much larger drug sites, which can facilitate the binding of bulkier aromatic ligands. The size difference may interfere with the microenvironment of both tryptophans, leading to a deviation in their quenching properties. Further study is required. In addition, the different electrostatic environments of the rear of domain III can be used to discriminate PSA from BSA. This zone of PSA is less electropositive, which can significantly disturb the aptamer binding affinity. Using an albumin-selective aptamer-based sensor targeting domain III therefore seems to be unsuitable for PSA detection. Furthermore, PSA contains a highly flexible C34, which can imply a more reactive C34. This high flexibility can support further conjugation reactions and widen the use of PSA as a covalently-bound drug and oxygen carrier. The molecular insights into PSA structure and dynamics presented here, especially the differences between BSA and PSA, can serve as a basis for designing specific and selective PSA biosensors, especially molecularly imprinted polymer-based sensors, to detect pork contaminants in halal food.

Figure 4. … (B). The cyan band in (B) indicates the range of hydrogen bond numbers found in BSA, CSA, FSA and HSA from previous work [26,27]. (C) Superimposition of PSA models from Alphafold (AF) and MODELLER (MOD). RMSDs of each domain were computed using the AF model as a reference. The labels show the average RMSDs of each domain. (D) and (E) are solvent-accessible areas of the whole PSA and each drug site (drug sites I and II) in comparison to those from previous work [26,27]. The grey band and dashed blue and green lines refer to solvent-accessible areas from HSA, BSA, CSA and FSA, respectively. (F) displays the electrostatic contour maps from MOD and AF models. Electropositive and electronegative surfaces are labelled in blue and red, respectively.

Disclosure statement
No potential conflict of interest was reported by the author(s).