Molecular Characterization of Buffalo Haptoglobin: Sequence Based Structural Comparison Indicates Convergent Evolution Between Ruminants and Human

ABSTRACT Haptoglobin (Hp) binds hemoglobin (Hb) with high affinity during intravascular hemolysis and scavenges hemoglobin-induced free radicals. Earlier reports point to unique features of the Hp molecule in human and cattle, but it has been little studied in other animals. In this paper, we characterized the buffalo Hp molecule and determined its molecular structure, evolutionary importance, and tissue expression. Comparative analysis and predicted domain structure indicated that buffalo Hp has an internal duplicated region, in the α-chain only, similar to the alternate Hp2 allele in human. This duplicated part encodes an extra complement control protein (CCP) domain. Phylogenetic analysis revealed that buffalo and other ruminants group together, separate from all non-ruminants, including human. The key amino acid residues involved in the Hp-Hb interaction, as well as in the interaction of Hp with the macrophage scavenger receptor CD163, showed significant variation in buffalo compared with non-ruminant species. Constitutive expression of Hp was also confirmed, for the first time, across all the vital tissues of buffalo. The results revealed that buffalo Hp is both structurally and functionally conserved, with an internal duplication in the α-chain similar to that of human Hp2 and other ruminant species, which might have evolved separately through convergent evolution. Furthermore, the extra CCP domain, possibly present in all ruminants, may affect dimerization of the molecule in these species.


Introduction
Haptoglobin (Hp), an acute-phase plasma protein, binds free hemoglobin (Hb) during hemolysis and inhibits oxidative damage by the highly reactive haem groups, thereby safeguarding body tissues (1,2). By forming the Hp-Hb complex, it also prevents renal filtration of Hb and possible damage to the kidney (3). Hp has been reported to be synthesized primarily in response to tissue injury and inflammation resulting from many diseases and autoimmune disorders in various species (4). In bovines, the concentration of Hp protein changes during inflammation and stress (5). In cattle, Hp has been identified as one of the acute-phase proteins that increase moderately at the site of inflammation during diseases such as pasteurellosis, mastitis, and foot-and-mouth disease (6-9). The concentration of Hp is higher in the serum and milk of cows with mastitis (10) and provides valuable diagnostic information for detecting and monitoring the disease (11,12). Hp is primarily synthesized in the liver in response to tissue injury or inflammation (4); however, in mammalian species such as cattle and mice, evidence suggests that Hp is also constitutively produced in most tissues under healthy conditions (13).
Hp protein consists of two α- and two β-chains connected by disulfide bonds, obtained after proteolytic cleavage of a precursor polypeptide translated from a single mRNA transcript (14,15). The α-chain contains a complement control protein (CCP) domain, mainly involved in the assembly of Hp molecules, whereas the β-chain contains a serine protease (SP) domain, which mediates the interaction with Hb (16). Furthermore, the "loop 3" region (residues 258-274) of the SP domain is responsible for recognition by the macrophage scavenger receptor CD163, which ultimately leads to endocytosis and degradation of the Hp-Hb complex. Mutations at Val 259, Glu 261, Lys 262, and Thr 264 have been found to disrupt CD163 binding (17). Recently, Arg-252 and Lys-262 in the CD163 binding loop of Hp were found to be the most essential residues for high-affinity receptor binding (18). Two allelic variants, Hp1 and Hp2, are reported in humans, giving three genotypes: Hp1-1, Hp1-2, and Hp2-2 (19,20). Evidence suggests that the human Hp2 variant arose by intragenic duplication of the CCP domain through a non-homologous and probably random cross-over within different introns of two Hp1 genes (21). The human Hp2 allele is present around the world; it seems to have originated under intense genetic pressure and to have a selective advantage over the Hp1 allele (22). In many mammals, Hp has been found to be similar to the human Hp1 allelic variant. However, a partial duplication in the α-chain of the Hp gene, homologous to human Hp2, has been reported in cattle, suggesting convergent evolution between human and cattle Hps (23).
Except for a few reports on its molecular and biochemical characterization, the structural and functional analysis of the Hp molecule has not been carried out in bovine species (10,23,24). Moreover, reports on convergent evolution of the Hp2 molecules of cattle and human have prompted interest in whether the same phenomenon occurs in other bovines (23). Buffalo (Bubalus bubalis) is an important bovine species, similar to cattle, that contributes significantly to milk, meat, and draft. Because it is well adapted to adverse tropical conditions with a wide temperature range and contributes enormously to the livestock economy of Southeast Asian countries, there is keen interest in the genetic basis of its unique attributes. In this study, we investigated the molecular structure of the Hp protein of Indian buffaloes, as derived from the translated amino acid sequence of the gene, and its evolution.

Tissue collection and RNA isolation
Buffalo tissue samples, namely skeletal muscle, intestine, adipose tissue, lungs, uterus, testis, kidney, rumen, mammary gland, and heart, were collected from an abattoir immediately after slaughter of a single animal. These tissues were selected to identify and assess the expression of Hp in nonconventional tissues, other than the liver, of a healthy animal. Collected samples were rinsed with phosphate-buffered saline (PBS) and chopped into small pieces. The samples were then transferred into CryoVials containing RNALater (Ambion) and kept at room temperature for 2 hours. CryoVials containing tissue samples were transported to the laboratory in liquid nitrogen at -196°C and finally preserved at -80°C until further processing. Total RNA was isolated from 100 mg of tissue homogenized in TRIzol (Invitrogen) and further purified using an RNeasy MinElute kit with on-column DNase treatment (Qiagen). The quality and quantity of the purified RNA were assessed by measuring absorbance at 260 nm and 280 nm on a UV spectrophotometer (NanoDrop, ND-1000). The ratio of absorbance at 260 nm to that at 280 nm was approximately 2 for each tissue, indicating a satisfactory level of RNA purity. RNA isolated from the different tissues was then used for cDNA synthesis.

RT-PCR amplification and sequence analysis
The cDNA was synthesized from 2 µg of total RNA using Oligo(dT) primers and the reagents supplied in the RevertAid first strand cDNA kit (MBI Fermentas), following the manufacturer's instructions. To characterize the buffalo Hp gene, two pairs of overlapping oligonucleotide primers were designed with the Primer3 program (http://www.ncbi.nlm.nih.gov/tools/primerblast/) from the cattle sequence (GenBank Acc. No. NM_001040470, Table 1) to amplify the complete ORF. PCR amplification was performed on cDNA synthesized from buffalo mammary gland in a 25-µL reaction volume with final concentrations of 1.5 mM MgCl2, 200 µM dNTPs, 10 pmol of each (forward and reverse) primer, 1X PCR buffer, and 1 U Taq DNA polymerase (MBI Fermentas). Amplification conditions were as follows: initial denaturation at 94°C for 3 min; thirty cycles of denaturation at 94°C for 30 sec, annealing at 55°C for 30 sec, and extension at 72°C for 1 min; and a final extension at 72°C for 10 min. After amplification of the specific product was confirmed by electrophoresis in a 1.5% agarose gel, both PCR products, together representing the complete coding region of the buffalo Hp gene, were sequenced from both ends by single-pass sequencing on an ABI PRISM 3100 Genetic Analyzer. Overlapping sequences from each of the samples were processed to obtain a contiguous sequence of the complete Hp gene.
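As a rough check on the cycling program described above, the stated steps can be written down as data and their programmed hold times added up. This is a minimal sketch: the names `INITIAL`, `CYCLE`, `FINAL`, and `total_hold_seconds` are hypothetical, and the result ignores ramp times between temperatures, which depend on the thermal cycler.

```python
# Cycling conditions as stated in the text; each step is (label, temp_C, seconds).
INITIAL = [("initial denaturation", 94, 3 * 60)]
CYCLE = [("denaturation", 94, 30), ("annealing", 55, 30), ("extension", 72, 60)]
FINAL = [("final extension", 72, 10 * 60)]
N_CYCLES = 30


def total_hold_seconds(initial, cycle, final, n_cycles):
    """Sum the programmed hold times, ignoring instrument ramp times."""
    per_cycle = sum(seconds for _, _, seconds in cycle)
    fixed = sum(seconds for _, _, seconds in initial + final)
    return fixed + n_cycles * per_cycle


total = total_hold_seconds(INITIAL, CYCLE, FINAL, N_CYCLES)
print(f"Programmed hold time: {total} s ({total / 60:.0f} min)")
```

The actual run time on an instrument would be somewhat longer than this figure because heating and cooling between steps are not counted.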

Sequence analysis of mammalian Hp and domain structure prediction
The Basic Local Alignment Search Tool (BLAST, http://blast.ncbi.nlm.nih.gov/Blast) was used to estimate the homology between the buffalo Hp gene and sequences of different mammalian species (Table 2). Sequence alignment analysis was carried out with UniProtKB software (http://www.uniprot.org/blast/uniprot/). The domain structure of the different mammalian Hps was predicted with the Simple Modular Architecture Research Tool (SMART), available in the public domain (http://smart.embl-heidelberg.de/smart/).

Phylogenetic analysis of mammalian Hp
The evolutionary history of buffalo Hp, based on complete mRNA and amino acid sequences, was inferred by the Neighbor-Joining method (25) using the MEGA5 program (26). To construct the phylogenetic tree, the sequences were aligned using MUSCLE multiple sequence alignment (27), and evolutionary distances were computed using the p-distance method (28). Bootstrap values were obtained from 500 replicates.
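The p-distance is simply the proportion of aligned sites at which two sequences differ, and a bootstrap replicate is built by resampling alignment columns with replacement. The sketch below illustrates both ideas in plain Python; it is not the MEGA5 implementation, the function names and toy sequences are hypothetical, and skipping gapped sites (pairwise deletion) is an assumption, since the text does not state how gaps were treated.

```python
import random


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(seq_a, seq_b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


# Toy aligned sequences, hypothetical and for illustration only.
toy = {"taxon_A": "ATGGCTAAGT", "taxon_B": "ATGGCTAAGA", "taxon_C": "ATGACTCAGA"}
print(p_distance(toy["taxon_A"], toy["taxon_B"]))
print(p_distance(toy["taxon_A"], toy["taxon_C"]))

rng = random.Random(0)  # fixed seed so the resampling is repeatable
replicates = [bootstrap_columns(toy, rng) for _ in range(500)]
```

In a full analysis, a distance matrix would be computed for each replicate and a Neighbor-Joining tree built from it; the bootstrap value of a clade is the fraction of replicate trees in which that clade appears.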

Tissue distribution and expression analysis of buffalo Hp
Expression profiling across the tissues was carried out by semiquantitative PCR, using the first primer set (HAP_CD1 F and HAP_CD1 R) to amplify buffalo Hp from cDNA synthesized from the RNA of nine different tissues collected. The housekeeping gene glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was coamplified to compare the expression of Hp in different tissues. Amplification conditions were the same as those described above. Results were recorded by horizontal gel electrophoresis, and expression was assessed by comparing the intensity of the amplified Hp and GAPDH products within each tissue as well as across tissues.

Sequence analysis of buffalo Hp
The complete ORF of buffalo Hp mRNA was amplified from buffalo mammary gland tissue as two overlapping fragments and sequenced from both ends. These fragments were assembled into a single contig of approximately 1.3 kb using the SeqMan program (Lasergene software, DNASTAR Inc.) and submitted to GenBank (Accession Number JX838852). The buffalo Hp mRNA sequence included a 1206-nucleotide coding region encoding a polypeptide chain of 401 amino acid residues, similar to cattle and Ibex goat but 54 residues longer than that of nonruminant species, except for human (Table 2). An internal duplication of a 162-nucleotide sequence accounts for the increased length of buffalo Hp compared with other species, as in cattle Hp and human Hp2 (21,23). Structurally, alignment of the buffalo Hp nucleotide sequence indicated the presence of seven exons, as reported in cattle Hp and human Hp2.
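The lengths quoted above are consistent with one another, as a short calculation shows. This sketch assumes that the 1206-nucleotide figure includes the stop codon, which the text does not state explicitly; the function and constant names are hypothetical.

```python
ORF_NT = 1206          # coding region length stated in the text
DUPLICATION_NT = 162   # internal duplication length stated in the text


def encoded_residues(orf_nt):
    """Residues encoded by an ORF whose length is assumed to include the stop codon."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_nt // 3 - 1


residues = encoded_residues(ORF_NT)      # polypeptide length in residues
dup_residues = DUPLICATION_NT // 3       # residues contributed by the duplication
print(residues, dup_residues, residues - dup_residues)
```

Under that assumption the ORF encodes 401 residues, the 162-nucleotide duplication corresponds to 54 residues, and removing it leaves 347 residues, which matches the stated 54-residue difference from nonruminant sequences.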
The buffalo Hp mRNA sequence exhibited 93 to 97% nucleotide identity with other ruminant species but only 72 to 81% with nonruminant species. Interestingly, the buffalo Hp and human Hp2 sequences had only 75% nucleotide identity, despite their similar internal duplications. Amino acid sequence homology of Hps was lower than the corresponding nucleotide identities among different species. The low amino acid identity appeared to be due to the higher number of nonsynonymous substitutions in the Hp gene. The predicted amino acid sequence of buffalo Hp revealed 16 amino acid changes, resulting from 35 nucleotide substitutions, compared with cattle Hp. Comparison of Hp sequences at both the nucleotide and amino acid levels revealed higher homology within ruminants than between ruminants and nonruminants, indicating significant divergence between these two groups.
The molecular structure of buffalo Hp was retrieved from the UniProtKB database, which revealed that buffalo Hp mRNA encodes both the α and β polypeptide chains in the same reading frame, similar to that of human (19) and cattle (23). It is noteworthy that the Hp protein found in most mammals consists of a signal peptide along with the α- and β-chain structures (29). A number of conserved regions were observed in Hp of buffalo and other ruminants (Fig. 1). In fact, barring the variations in the α-chain, the molecular structure of Hp was nearly the same in all the species. Across the species, the α- and β-chains are linked by a single arginine residue in the precursor molecule, which is released during proteolytic maturation, generating the α and β subunits (19,29). The buffalo Hp sequence also had the conserved interlinking arginine, indicating that buffalo Hp matures by a similar process. The amino acid sequence homology and the conserved cleavage site further strengthen the view of a common origin for Hp and the serine protease family (14).
The α- and β-chains of buffalo Hp were found to be 137 and 245 residues long, respectively. The literature suggests that the α-chain is highly variable, giving rise to structural variations among Hps of different species (14). In contrast, the β-chain was similar in size in all the species, ruminants and nonruminants alike. Buffalo Hp α- and β-chains had high homology with the corresponding chains of the other ruminant species but low homology with those of nonruminant species (Table 3). Overall, however, homology among β-chains was higher than among α-chains, indicating the highly conserved nature of β-chains across species, especially among ruminants.
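The chain lengths can be checked against the 401-residue precursor. The sketch below takes the 19-residue signal peptide from the domain prediction reported in the domain structure section and assumes that the three segments lie back to back with no gaps; the residue ranges it prints are implied by that assumption and are not coordinates reported in the text. All names are hypothetical.

```python
# Lengths stated in the text; the contiguous layout below is an assumption.
PRECURSOR_LENGTH = 401
SEGMENTS = [("signal peptide", 19), ("alpha chain", 137), ("beta chain", 245)]


def segment_ranges(segments):
    """Assign 1-based inclusive residue ranges to back-to-back segments."""
    ranges, start = {}, 1
    for name, length in segments:
        ranges[name] = (start, start + length - 1)
        start += length
    return ranges


ranges = segment_ranges(SEGMENTS)
covered = sum(length for _, length in SEGMENTS)
print(ranges)
print("accounts for full precursor:", covered == PRECURSOR_LENGTH)
```

The three lengths sum to 401, so under this layout the signal peptide and the two chains account for the whole precursor.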

Phylogenetic analysis of buffalo Hp
A phylogenetic tree was derived from Hp mRNA sequences of different mammalian species to establish evolutionary relationships. Phylogenetically, buffalo and other ruminants grouped together, separate from all nonruminants, including human (Fig. 2A). Based on Hp nucleotide sequences, within ruminants the buffalo was closer to cattle than to Ibex goat and red deer. Phylogenetic analysis of conceptually translated amino acid sequences of mammalian Hp revealed a similar evolutionary lineage (data not shown). Interestingly, although ruminant Hp and human Hp2 both have internal duplications, phylogenetic analysis of these sequences did not reveal the expected closeness between human and ruminant species. These results indicate that the internal duplications in the Hp genes of humans and ruminants might have occurred as two separate events. Thus, similar but independent evolutionary processes in the Hp genes of human and ruminants, occurring at different times, indicate convergent evolution. This was also confirmed in a separate phylogenetic tree for the α-chains containing CCP domains of different Hps (Fig. 2B), which revealed two distinct and completely separated groups of Hps: one comprising the ruminants and another comprising the nonruminants, including human Hp2. The Hp α-chains of all the ruminants formed a distinct lineage, which strongly indicates that the α-chain in ruminants evolved independently from that of nonruminant species. Earlier studies also revealed that Hp2 in humans arose by intragenic duplication of a 1.7 kb DNA fragment of the Hp1 gene after the divergence of humans during late primate evolution (21). Furthermore, it can be hypothesized that in ruminants the Hp gene diversified after duplication within the α-chain region and that all ruminant species have a duplicated CCP domain, which might have been advantageous during evolution.

Domain structure analysis of buffalo Hp
The domain structure of Hp amino acid sequences in buffalo and other species was predicted using the Simple Modular Architecture Research Tool (SMART) (Suppl. Figure 1). The results revealed that buffalo Hp has four domains: one signal peptide (amino acids 1-19), two complement control protein (CCP) domains, and one serine protease (SP) domain. The domain structure of buffalo Hp was found to be similar to that of human Hp2 and cattle Hp (23). The CCP domain is mainly involved in the assembly of Hp molecules, whereas the SP domain forms extensive interactions with Hb (16). In buffalo Hp, the CCP and SP domains were located in the α- and β-chains, respectively, similar to cattle (14). Domain boundaries were also predicted by UniProtKB and corroborated the findings of the SMART analysis. The analysis revealed a similar domain structure, with an identical number and size of Hp domains, in the other ruminants; however, the ruminant Hp domain structure was completely different from that of nonruminant species, except for the human Hp2 allele. Only ruminant Hp and human Hp2 were found to have a total of four domains. Surprisingly, the Hp molecules of all the nonruminant species, except human, lacked the 54-amino-acid CCP2 domain. Dog and cat Hp had no CCP domains, whereas pig and mouse Hp had only one CCP domain. In contrast to the CCP domains, the SP domain was highly conserved in size across all the species compared. The predicted CCP1 and CCP2 domains of buffalo Hp were 52 and 54 amino acid residues long, respectively, similar to all other ruminants. All four domains observed in buffalo Hp were also highly conserved among the ruminants. The buffalo CCP1 domain was two residues shorter than that of nonruminant species.

[Table data, column headers not recovered]
Buffalo: 66, -, 59, -, 74, -, 63, -, 80
Cattle: 59, 98, 57, 88, 72, 94, 61, 97, 80
Ibex goat: 63, 95, 59, 91, 74, 93, 63, 96, 79
Red deer: 62, 98, 62, 86, 71, 90, 60, 92, 82
Human: 97, -, -, -, -, -
In human, the duplicated CCP1 and CCP2 domains, each 54 amino acid residues long, differed at only two positions (Asp40Asn and Glu41Lys); in buffalo and other ruminants, however, Asn40 in CCP1 was substituted by serine in CCP2, and Lys41 was deleted in CCP1. These different substitutions at corresponding positions of Hp in human and ruminants indicate two separate duplication events during evolution.
To further explore the duplicated regions in Hp of buffalo and other ruminants, the CCP domains were analyzed for internal repeats of amino acid residues. Alignment of Hp amino acid sequences revealed a number of blocks of internal repeats interspersed in ruminant Hp (Fig. 1), but the repeats were less frequent than in human Hp2. The amino acid identity between the CCP1 and CCP2 domains was 97% for human Hp, whereas it ranged from 59% to 66% in ruminants, which also explains the smaller number of internal repeats observed (Table 2). Within ruminants, on the other hand, amino acid identity was very high for the corresponding CCP1 (higher than 95%) and CCP2 (higher than 86%) domains. Functionally, these duplicated CCP domains give rise to different high molecular weight oligomers, which can affect the hemoglobin binding properties of Hp. In human, the larger Hp2 molecule with two CCP domains was found to have weaker hemoglobin binding and antioxidative activities but a stronger angiogenic effect than the Hp1 molecule. The Hp2 phenotype is also over-represented in autoimmune diseases and is favored under strong genetic pressure in humans (30). The human Hp2 allele is presumed to have originated under intense genetic pressure and to have a selective advantage over the Hp1 allele (22).
A considerable number of amino acid differences were observed between ruminant and non-ruminant species in the "loop 3" region (residues 258-274) of Hp, the part of the SP domain (β-chain) associated with binding of the macrophage scavenger receptor CD163. In buffalo and other ruminants, the residues essential for CD163 recognition differed from human at Arg252Lys, Val259Ala, Glu261Lys, and Lys262Asn. In human, amino acid residues Val259, Glu261, Lys262, and Thr264 within the VPEKKT motif in loop 3 of the SP domain were found to be essential for CD163 recognition by the Hp-Hb complex, and mutations at these residues were found to disrupt CD163 binding (17). Three ruminant species, cattle, buffalo, and wild goat, had an APKNKT motif at the same position instead. More recently, only two Hp residues in the CD163 binding loop of the Hp-Hb complex, Arg-252 and Lys-262, were identified as essential for high-affinity receptor binding (18); both are replaced in ruminants, by Lys-252 and Asn-262, respectively. Our findings thus show that the loop 3 region, specifically the VPEKKT motif, is not structurally and functionally conserved across species for recognition of CD163, and other residues may possibly be involved in CD163 recognition in ruminants.
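Comparing the two motifs position by position makes the substitutions explicit. In this minimal sketch the motif strings are those given in the text, the starting position of 259 is taken from the human numbering (Val259 as the first motif residue), and the helper name `motif_differences` is hypothetical.

```python
HUMAN_MOTIF = "VPEKKT"     # human loop 3 motif, as given in the text
RUMINANT_MOTIF = "APKNKT"  # motif reported for cattle, buffalo, and wild goat
MOTIF_START = 259          # assumed position of the first motif residue (human numbering)


def motif_differences(reference, query, start):
    """List (position, reference_residue, query_residue) for each mismatch."""
    if len(reference) != len(query):
        raise ValueError("motifs must have the same length")
    return [
        (start + i, ref, alt)
        for i, (ref, alt) in enumerate(zip(reference, query))
        if ref != alt
    ]


diffs = motif_differences(HUMAN_MOTIF, RUMINANT_MOTIF, MOTIF_START)
for pos, ref, alt in diffs:
    print(f"{ref}{pos}{alt}")
```

Three of the six motif positions differ (259, 261, and 262), while Pro260, Lys263, and Thr264 are shared between the human and ruminant motifs.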

Expression analysis of buffalo Hp
RT-PCR amplification was performed to confirm the expression of transcript variants in different tissues of buffalo. Semiquantitative RT-PCR confirmed the amplification of only one Hp transcript in all of the tissues, that is, skeletal muscle, intestine, adipose tissue, lungs, uterus, testis, kidney, rumen, mammary gland, and heart, without any alternate splice variant in any tissue (Fig. 3). However, as evident from band intensity, the expression of Hp appears to vary among buffalo tissues relative to the GAPDH housekeeping gene used as the control. The results indicate constitutive expression of buffalo Hp mRNA in all the tissues of healthy buffalo, even in the absence of a systemic acute phase response, as reported in bovines, in which Hp is constitutively expressed in most of the vital tissues under normal conditions (13). Judging by band intensities, the expression of the Hp gene was low in intestine and stomach, whereas it was high in adipose tissue and mammary gland. Our findings agree with those of other workers, who reported very low expression of Hp mRNA in the abomasum and small intestine of cattle (13,31). In addition to serum amyloid A3, a higher concentration of Hp in milk has also been found to be associated with mammary gland inflammation in cattle (10,32).
Our results thus suggest that the CCP domains of Hp in buffalo and other ruminants originated through an internal duplication in the α-chain. However, this duplication event is completely separate from the internal duplication of CCP domains found in human Hp, which indicates convergent evolution at different points in time. Furthermore, the CD163 binding motif in Hp was conserved among the three ruminant species but differed from that of human Hp. Buffalo Hp also showed constitutive expression across different healthy tissues, suggesting wider biological significance.