Length variation and sequence divergence in the mitochondrial control region of Schizothoracine (Teleostei: Cyprinidae) species

Abstract Schizothoracine fish, commonly called snow trouts, inhabit the entire network of snow- and spring-fed cool waters of Kashmir, India. Of the more than 10 species reported earlier, only five have been found: Schizothorax niger, Schizothorax esocinus, Schizothorax plagiostomus, Schizothorax curvifrons and Schizothorax labiatus. The relationships among these species are disputed. To understand their evolutionary relationships, we examined the mitochondrial D-loop sequences of 25 individuals representing the five species. Sequence alignment showed that the D-loop region is highly variable, and length variation was observed in a di-nucleotide (TA)n microsatellite between and within species. Interestingly, in all these species the (TA)n microsatellite is not associated with longer tandem repeats at the 3′ end of the mitochondrial control region and does not show heteroplasmy. Our analysis also indicates the presence of four conserved sequence blocks (CSB-D, CSB-1, CSB-2 and CSB-3), four termination-associated sequence (TAS) motifs and a 15 bp pyrimidine block within the mitochondrial control region, all of which are highly conserved within the genus Schizothorax when compared with other species. Phylogenetic analyses by maximum likelihood (ML), neighbor joining (NJ) and Bayesian inference (BI) generated almost identical results. The resulting BI tree showed a close genetic relationship among all five species and supports two distinct groupings of S. esocinus. Beyond the species relationships, the length variation in tandem repeats is attributed to differences in the predicted stability of secondary structures. The role of CSBs and TASs, reported so far as the main regulatory signals, would explain the conservation of these elements in evolution.


Introduction
The schizothoracine fishes belong to the subfamily Schizothoracinae (family Cyprinidae, order Cypriniformes) and comprise 90 species (and subspecies) in 12 genera. According to Cao et al. (1981), the subfamily Schizothoracinae originated from primitive barbine fish distributed in Tibet during the late Tertiary and subsequently evolved into more specialized groups owing to the uplift of Tibet. The fish are believed to have migrated into the waters of the Kashmir region from the Central Asiatic south slopes of the Himalayas and the Sulaiman range (Das & Subla, 1964). However, no palaeogeographical hypothesis based on dispersal and vicariance has been proposed to account for the distribution pattern of Schizothorax. There is also great confusion about the number of species present. Heckel (1838) was the first to report 10 species of schizothoracines from Kashmir. Silas (1960), however, listed 12 species.
The morphometric characteristics used to classify fishes of this region are highly contradictory, and it is often difficult to decide whether these are different species, different phenotypes of a single species or an intermediate situation between the two extremes (Raina & Petr, 1999). Based on the surveys by Talwar & Jhingran (1991), Kullander et al. (1999), Yousuf et al. (1996) and Balkhi (2005), there are only five recognized species within the genus Schizothorax from Kashmir, India: Schizothorax esocinus (Heckel, 1838), Schizothorax plagiostomus (Heckel, 1838), Schizothorax labiatus (McClelland, 1842), Schizothorax niger (Heckel, 1838) and Schizothorax curvifrons (Heckel, 1838). The uncertainty surrounding the phylogenetic relationships and biogeography of this group continues to the present day. During the last few years, the mitochondrial control region has been used specifically to study population structure, gene flow, migration and phylogeography, and to assess female nesting behavior (Norman et al., 1994).
In this study, besides the molecular phylogeography of the Schizothorax species, we report the special structural features of a microsatellite sequence within the mitochondrial control region of these species. To the best of our knowledge, this study is the first to investigate the role of a mitochondrial microsatellite as a marker for studying inter- and intraspecific variation within schizothoracines.

Materials and methods
Taking into account potential problems with the identification of species (Talwar & Jhingran, 1991), all five Schizothorax species were identified using the taxonomy of Jhingran (1991). Five individuals per species, collected from different localities, were used for DNA isolation (Table S1). DNA was isolated from muscle tissue following a standard proteinase K, phenol/chloroform extraction protocol (Sambrook et al., 1989). The D-loop was amplified using the primers and PCR conditions of Thai et al. (2004).
We considered nodes with bootstrap values ≥70% (Hillis & Bull, 1993) and Bayesian posterior probabilities ≥95% to be well supported. Heuristic searches for the ML analyses used 100 replicates of random sequence additions, followed by nonparametric bootstrapping comprising 100 replications with 10 random sequence additions. The NJ tree was constructed with distances calculated under the same model of evolution as the ML analysis, with bootstrapping performed using 1000 replicates. Bayesian analyses were performed with random starting trees and run for 1.0 × 10⁶ generations, sampling the four Markov chains every 100 generations and resulting in 10,000 trees. The likelihood scores of the sampled trees were plotted against generation time to ensure that stationarity was reached; trees generated prior to stationarity were discarded as "burn-in" (2000 trees in this case). Bayesian posterior probabilities of each bipartition, representing the percentage of times each node was recovered, were calculated from a 50% majority rule consensus of the remaining trees.
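The sampling arithmetic and the consensus step described above can be illustrated with a minimal sketch. It is not the software used in this study: the function names and the representation of a tree as a set of clades are assumptions made for illustration only, and the toy chain is invented.

```python
from collections import Counter


def sampled_tree_count(generations, sample_every):
    """Number of trees saved when the chain is sampled every `sample_every` generations."""
    return generations // sample_every


def posterior_probabilities(trees, burn_in):
    """Fraction of post-burn-in trees that contain each clade.

    `trees` is a list in sampling order; each tree is represented
    (hypothetically) as a set of frozensets, one per clade.
    """
    kept = trees[burn_in:]
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: n / len(kept) for clade, n in counts.items()}


# Figures quoted in the text: 1.0 x 10^6 generations, sampled every 100,
# with 2000 trees discarded as burn-in.
total = sampled_tree_count(1_000_000, 100)
retained = total - 2000

# Toy chain (invented data): 10 sampled trees, the first 2 discarded as burn-in.
ab = frozenset({"A", "B"})
cd = frozenset({"C", "D"})
toy_trees = [{cd}] * 2 + [{ab}] * 6 + [{ab, cd}] * 2
support = posterior_probabilities(toy_trees, burn_in=2)

# 50% majority rule: keep clades recovered in more than half of the retained trees.
consensus = {clade for clade, p in support.items() if p > 0.5}
```

With the settings quoted in the text, 10,000 trees are sampled and 8000 remain after burn-in; posterior probabilities are then simple frequencies over the retained trees.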

Results and discussion
Sequence alignment of the control region showed length variation ranging from 785 to 793 bp across the five species under study. For comparison, the complete D-loop of a Schizothorax sequence deposited in GenBank is 933 bp long (Schizothorax macropogon, KC020113). The sequence composition and arrangement of the repeats varied considerably between and within species. Interestingly, the length variation was observed in a di-nucleotide (TA)n microsatellite with a variable number of repeat units (n = 7–14). In these species the (TA)n microsatellite was not associated with longer tandem repeats at the 3′ end of the light (L) strand of the mitochondrial control region. Comparison of repeat array lengths among the five schizothoracine species reveals that the modal number of microsatellite repeats varied from seven in S. esocinus to 14 in S. curvifrons (Table S2). It is unusual to find a microsatellite that is not associated with tandem repeats in the mitochondrial control region and that shows both inter- and intraspecific variation. Hoelzel et al. (1993) reported, for the first time, an (AC)n GT microsatellite not associated with longer tandem repeats at the 3′ end of the mitochondrial control region; it showed extensive heteroplasmy, with up to three length variants present in single individuals of two elephant seal species. In our case, however, we did not find heteroplasmy in the D-loop of any species, although heteroplasmy for length variation has been frequently observed (reviewed in Rand, 1993). The exact molecular mechanisms that create such variation are not completely understood, but length changes in microsatellite DNA are most often attributed to replication slippage, that is, transient dissociation of the replicating DNA strands followed by misaligned reassociation (Rochej et al., 1990). Mitochondrial DNA length variations caused by tandem repeats have previously been identified in a number of fish species: sturgeon (e.g. Acipenser transmontanus, Acipenser oxyrhynchus desotoi; Brown et al., 1992), cod (Gadus morhua; Johansen et al., 1990), European sea bass (Dicentrarchus labrax; Cecconi et al., 1995), American shad (Alosa sapidissima; Bentzen et al., 1988) and Leuciscinae (Pisces: Cyprinidae; Sasaki et al., 2007).
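Counting the repeat units of a di-nucleotide microsatellite, and taking the modal count per species, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the sequences are invented rather than taken from Table S2, and the regular-expression approach is one simple way to score uninterrupted runs.

```python
import re
from collections import Counter


def longest_repeat_run(seq, unit="TA"):
    """Largest number of consecutive, uninterrupted copies of `unit` in `seq`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    best = 0
    for match in pattern.finditer(seq.upper()):
        best = max(best, len(match.group(0)) // len(unit))
    return best


def modal_repeat_count(seqs, unit="TA"):
    """Most frequent repeat count among the individuals of one species."""
    counts = Counter(longest_repeat_run(s, unit) for s in seqs)
    return counts.most_common(1)[0][0]


# Invented flanking sequence and repeat arrays, for illustration only.
species_a = ["GGC" + "TA" * 7 + "CCG", "GGC" + "TA" * 7 + "CCG", "GGC" + "TA" * 8 + "CCG"]
species_b = ["GGC" + "TA" * 14 + "CCG", "GGC" + "TA" * 14 + "CCG", "GGC" + "TA" * 13 + "CCG"]

modal_a = modal_repeat_count(species_a)
modal_b = modal_repeat_count(species_b)
```

Scoring each individual separately and then taking the mode keeps intraspecific variation visible instead of collapsing it into a single species value.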
The nucleotide sequence alignment also allowed the identification of four conserved sequence blocks (CSB-D, CSB-1, CSB-2 and CSB-3). The sequence of conserved block D (CSB-D) is highly conserved within the family Cyprinidae and shows more than 85% identity with other teleosts (Table 1). According to Southern et al. (1988), CSB-D is the most universally conserved segment among fish families, suggesting that it carries functions critical for mitochondrial metabolism. Although a number of different approaches have been tried (Mignotte et al., 1987), the function of this central conserved region is not understood. Next to CSB-D, a GTGGG-box, which is common to euteleosts, was also identified (Figure S1). Compared with other teleosts, all three CSBs (CSB-1, CSB-2 and CSB-3) were similar within the genus Schizothorax. Comparing these with the available CSB sequences from representative teleosts, we identified the following key features. CSB-1 has an A/T-rich region followed by a similar TCAAGTGCATA motif. This motif of high similarity (CSB-1) seems to be commonly distributed among fish species. In contrast to other studies, CSB-3 was more conserved than CSB-2 within the family Cyprinidae. CSB-2 is characterized by a poly-C stretch separated by TA, and CSB-3 contains a sequence of three A's followed by a poly-C stretch. These CSBs are thought to be involved in the formation of the proper RNA primer for mtDNA replication and to play a role in the switch from RNA to DNA synthesis that commences at O_H (Clayton, 1991). In addition, we found three TAS (termination-associated sequence) motifs (TACAT) in the 5′ part of the control region and one in the central domain (Figure S1). Brzuzab & Ciesielski (2002), however, reported all four similar TAS motifs in the 5′ D-loop of three coregonine species.
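Locating copies of a short motif such as TACAT in a control-region sequence can be sketched as below. The helper name and the example sequence are hypothetical; a zero-width lookahead is used so that copies sharing a base are all reported.

```python
import re


def find_motif(seq, motif="TACAT"):
    """0-based start positions of every copy of `motif`, including overlapping copies."""
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [match.start() for match in pattern.finditer(seq.upper())]


# Invented sequence: two copies sharing a T, then a third copy further downstream.
example = "TACATACATGGCCTTACATAA"
positions = find_motif(example)

# TACAT reads the same in both directions on one strand.
reads_same_backwards = "TACAT" == "TACAT"[::-1]
```

Start positions like these can be compared against the alignment coordinates of the CSBs to decide whether a copy falls in the 5′ part or the central domain.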
Up to three termination-associated sequences (TASs) involved in the premature termination of H-strand replication (Doda et al., 1981), as well as several copies of the conserved palindromic motif 5′-TACAT-3′ (Saccone et al., 1991), were found at the 5′ end of the control region. However, no secondary structures were identified in the left region of the D-loop. We also identified a pyrimidine block of 15 bp between CSB-D and CSB-1. A 15 bp pyrimidine block was identified in the Atlantic cod (Johansen et al., 1990) and 26 blocks in the rainbow trout (Digby et al., 1992). This site may provide a point of interaction with the mtDNA single-stranded binding protein (Digby et al., 1992).
The sequence divergence and mutation analysis between the species were calculated using PAUP. The D-loop was AT rich, and the 5′ end was more conserved than the 3′ end. The mean uncorrected pairwise genetic distance between species was 20.22%. Of 789 total characters, 638 (80.9%) were constant, 89 (11.3%) were variable but parsimony uninformative, and 62 (7.9%) were variable and parsimony informative. The proportion of invariable sites was estimated to be zero, and the shape parameter of the gamma distribution was estimated to be 0.573. The estimated transition/transversion bias (R) was 4.44. The ML, NJ and BI methods of analysis generated almost identical relationships. The likelihoods of the trees, optimized under the topologies obtained by the ML, NJ and BI methods, did not differ significantly. Because of this congruence, only the BI tree is presented, with posterior probabilities following maximum likelihood bootstraps (Figure 1). The best-fit model for the D-loop data was GTR + G + I. The model included estimated nucleotide frequencies of A = 0.32, C = 0.19, G = 0.14 and T = 0.33, and a nucleotide substitution rate matrix of A-C = 0.40, A-G = 4.6, A-T = 0.44, C-G = 0.21 and C-T = 2.6.
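The uncorrected pairwise distance and the three site categories reported above can be illustrated with a short sketch. It is not a reimplementation of PAUP: the function names and the four-sequence alignment are invented, and the treatment of gaps (ignored here) is an assumption that may differ from the program's defaults.

```python
from collections import Counter
from itertools import combinations


def p_distance(a, b):
    """Uncorrected distance: proportion of differing sites, ignoring gapped positions."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_p_distance(alignment):
    """Mean uncorrected distance over all pairs of sequences."""
    dists = [p_distance(a, b) for a, b in combinations(alignment, 2)]
    return sum(dists) / len(dists)


def classify_sites(alignment):
    """Count constant, parsimony-uninformative and parsimony-informative columns.

    A column is parsimony informative when at least two character states
    each occur in at least two sequences.
    """
    counts = Counter()
    for column in zip(*alignment):
        states = Counter(c for c in column if c != "-")
        if len(states) <= 1:
            counts["constant"] += 1
        elif sum(1 for n in states.values() if n >= 2) >= 2:
            counts["informative"] += 1
        else:
            counts["uninformative"] += 1
    return counts


# Invented toy alignment (four sequences, six sites).
toy = ["ACGTAC", "ACGTTC", "ACCTTA", "ACCGAA"]
site_counts = classify_sites(toy)
toy_mean = mean_p_distance(toy)

# Percentages implied by the character counts quoted in the text (789 sites in total).
shares = {k: round(100 * v / 789, 1) for k, v in
          {"constant": 638, "uninformative": 89, "informative": 62}.items()}
```

In the toy alignment, two columns are constant, one is variable but uninformative and three are parsimony informative, which shows how the three reported categories partition the total number of characters.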
The resulting BI tree (Figure 1) clearly indicates two different groups of S. esocinus and S. plagiostomus populations. One group of populations nests with S. labiatus and shows a close relationship with S. niger and S. curvifrons. The other groups of S. esocinus and S. plagiostomus populations clustered separately. The result based on the D-loop also supports the high genetic similarity between S. niger and S. curvifrons. The finding that some populations of S. plagiostomus and S. esocinus cluster together may indicate hybridization between the species. The lack of difference in the mitochondrial sequence data of Schizothorax species may be explained in terms of introgressive hybridization, incomplete lineage sorting, rapid radiation in lineages and multiple hits (homoplasy) (He & Chen, 2006). Silas (1960) reported that inter-specific hybridization in nature takes place to a greater extent among the Schizothoracinae, with easily recognizable hybrid combinations resulting from Schizothorax esocinus × Schizothorax spp. and Oreinus plagiostomus × Schizothorax spp. Since Heckel (1838) erected the genus Schizothorax (species with four barbels and with or without labial papillae, described from the Kashmir Valley), the taxonomy of the group has been confused and disputed. Silas (1960) further reported that the differences between S. curvifrons and S. niger are not sufficiently distinct to justify their treatment as separate species; therefore, the latter is treated as a subspecies of S. curvifrons. The BI phylogenetic tree (Figure 1) suggests that S. niger might have evolved from S. curvifrons relatively recently. Our results based on D-loop analysis support the complexity of relationships among the schizothoracine species found in India.
This study suggests that the length variation in the di-nucleotide (TA)n microsatellite repeats at the 3′ end of the D-loop may be a potentially informative molecular marker for studying population structure and genetic diversity, as well as for conservation practices, of fish species. Further, the presence of genus-specific conserved sequence blocks (CSBs) might be useful for identification and might explain the different conservation mechanisms of these elements in evolution. Investigating more schizothoracine populations might therefore further elucidate the mutation mechanisms and the complexity of relationships in this geographically isolated area and lead to more conclusive results. The use of more mitochondrial and nuclear markers for each of the species could help resolve the species relationships.
The CSB sequences for superscripts 1 and 2 were reported by Brzuzab & Ciesielski (2002), and the asterisk represents the GenBank sequences (Schizothorax macropogon, accession no. KC020113; Cyprinus carpio carpio, accession no. JN105352).
DOI: 10.3109/19401736.2014.945581