Distribution and genetic diversity of cobitid species in Iran (Teleostei: Cobitidae)

Several species of Cobitis have been reported from Iran, but only four are recognized as valid. The geographical distribution and genetic diversity of these species are not well known. Two new populations of C. saniae and C. avicennae are reported herein, based on their morphological and phylogenetic characters, from the Mahabad River, a tributary of the Urmia Lake basin, and the Razavar River, a tributary of the Karkheh drainage, respectively. The genetic diversity and genetic structure of populations of C. saniae, C. avicennae, and C. faridpaki were analysed using DNA sequences of the mitochondrial COI gene from various watersheds in Iran, Azerbaijan, and Georgia. Moderate to high haplotype diversity and low nucleotide diversity were found in most populations of all three species. The AMOVA test and pairwise FST comparisons revealed significant genetic structure and genetic distance between the Urmia Lake basin population and other C. saniae populations. The results highlight the importance of conserving these species and their habitats, particularly in the Urmia Lake basin, where significant genetic differentiation was observed.


Introduction
The genus Cobitis (Linnaeus, 1758) of the family Cobitidae is widespread throughout the freshwaters of the Palaearctic region. It has been found in Eurasia, from the Far East to North Africa (Mousavi-Sabet et al., 2011; Kottelat, 2012). A total of 31 species of the genus Cobitis have been recognized as valid in the Middle East (Çiçek et al., 2024). In Iran, several species of this genus have been reported, but only four are recognized as valid (Sayyadzadeh & Esmaeili, 2024). Cobitis linea (Heckel, 1849) was originally described from the Kor River system around Persepolis, Iran (Bianco & Nalbant, 1980). Two other species were recently described in the southern part of the Caspian Sea basin. Cobitis faridpaki Mousavi-Sabet, Vasil'eva, Vatandoust & Vasil'ev, 2011 was originally described from the southeastern Caspian Sea basin in the Siahrud River and was later reported in the Namak Lake basin (Eagderi et al., 2017a). Cobitis saniae Eagderi, Jouladeh-Roudbar, Jalili, Sayyadzadeh & Esmaeili, 2017 is another valid species in the southern part of the Caspian Sea basin. It was described from the southwestern portion of the Caspian Sea basin in the Bara Goor River, a tributary of the Sefidrud River (Eagderi et al., 2017b). This species was then reported from the Urmia Lake basin in the Zarineh River (Eagderi et al., 2020). Cobitis saniae was also reported from the Kura River drainage in Georgia and Azerbaijan, the western part of the Caspian Sea basin in Azerbaijan (Vasil'eva et al., 2020), the Aras River drainages in Azerbaijan, Armenia, Iran, (2020) for the population from the Zarineh River, a tributary of the Urmia Lake basin; and Vasil'eva & Vasil'ev (2020), for the populations from the western basin of the Caspian Sea in Azerbaijan (Malyi Kyzylagach Bay, Alvady and Vilyashchay Rivers). The comparison of morphometric features between populations was conducted based on the range of each characteristic for both male and female sexes. The description of lines and color pattern on the dorsolateral side of the trunk followed the method described by Gambetta (1934). The Gambetta zones begin with Z1, a thin line just below the mid-dorsal blotch row, and end with Z4, the middle lateral blotch row.
Molecular analysis. DNA was extracted from muscle tissue using the salting-out protocol (Cawthorn et al., 2011). The barcoding fragment of the mitochondrial cytochrome c oxidase subunit I (COI) was amplified using FishF1 and FishR1 primers (Ward et al., 2005). The resulting fragments were purified using a Promega DNA purification kit and sequenced using the Sanger bidirectional method, performed by Topaz Gene Company, Iran. After editing, the sequences were deposited in NCBI GenBank with their corresponding accession numbers. COI sequences from related cobitid species and those from other reported populations of Iranian cobitid species were retrieved from the GenBank database (Table S1). Sequences were aligned using the ClustalW program running in MEGA version 11 (Tamura et al., 2021) and trimmed to an equal length of 550 bp. Maximum Likelihood (ML) and Bayesian Inference (BI) methods were used to construct phylogenetic trees. The best model for nucleotide substitution was determined using jModelTest 2.0 software (Santorum et al., 2014). The GTRGAMMA model was applied to construct the ML tree using RAxMLGUI v2.0.9 software (Edler et al., 2021), while MrBayes 3.1 software was used for Bayesian inference (Huelsenbeck & Ronquist, 2001). Markov chains were run for 5,000,000 generations, and log-likelihood stability was achieved after 10,000 generations. The first 1000 trees were excluded as burn-in. The remaining trees were then used to compute a 50% majority-rule consensus tree. Trees were visualized in FigTree v. 
1.4.2, and the K2P model was used to obtain intraspecific and interspecific distances between the new sequences and those retrieved from NCBI using MEGA 11 software. To delimit genetic clusters of the newly described cobitid populations, the Assemble Species by Automatic Partitioning (ASAP) method (Puillandre et al., 2021) was used through its online version (https://bioinfo.mnhn.fr/abi/public/asap/asapweb.html).
Zoology in the Middle East 123
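As a rough illustration of the K2P distances mentioned above, the Python sketch below computes the Kimura two-parameter distance for one pair of aligned sequences, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. It is not the MEGA implementation; the function name `k2p_distance` and the decision to skip sites with gaps or ambiguous bases are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration only. Sites where either
    sequence has a gap or an ambiguous base are skipped (an assumption).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # math.log raises ValueError if the sequences are too divergent
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For a 550 bp alignment such as the one used here, the function would be called once per sequence pair, and intraspecific or interspecific distances would then be averages over the relevant pairs.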
The genetic diversity and genetic structure of C. saniae were analysed using three newly generated COI sequences and 44 COI sequences from the GenBank database (Table S1). The sequences were grouped into four areas according to the geographical origin of the specimens (Figure 1). In the western basin of the Caspian Sea, 27 COI sequences were analysed from the Lankaran, Masalla, Astara, and Jalilabad regions of Azerbaijan, as well as three COI sequences from the Iranian part of the Astara region and the Aras basin. These sequences were assigned to the C. saniae population in the Western Caspian Sea (CSWCS). Six COI sequences were used from the upstream regions of the Kura basin in Azerbaijan and Georgia. These sequences were assigned to the C. saniae population in the Kura basin (CSKB). Six COI sequences were also obtained from the southwestern region of the Caspian Sea basin, specifically in the Sefidrud and Talesh-Mordab sub-basins. These sequences were assigned to the C. saniae population in the Sefidrud-Talesh sub-basin (CSSTB). Finally, three new COI sequences from the Mahabad River and two from the Zarineh River were assigned to the C. saniae population of the Urmia Lake Basin (CSULB).
The genetic diversity and structure of C. avicennae were assessed using three newly generated COI sequences from the Razavar River (CAR) population and three COI sequences from the Javanrud River (CAJ) population, both from the Tigris River basin. Seventeen COI sequences of C. faridpaki were also used for the genetic population analysis of this species. Fourteen COI sequences were used from the rivers Babolrud, Tajan, Siahrud and Keselian, all from the southeast of the Caspian Sea basin in the Haraz-Neka sub-basin, and were assigned to the C. faridpaki population of the southeastern Caspian Sea basin (CFSCS). In addition, three COI sequences from the Karaj River drainage area in the Namak Lake basin were used and assigned to the C. faridpaki Namak Lake basin (CFNLB) population.
Fu and Li's D (Fu and Li, 1993) and Tajima's D (Tajima, 1989) statistics were calculated for each species to test the null hypothesis of selective neutrality using DnaSP 6.0 (Rozas et al., 2017). Genetic diversity parameters within populations, including the number of polymorphic sites, number of haplotypes, number of singleton variable sites, number of parsimony informative sites, haplotype diversity and nucleotide diversity, were calculated using DnaSP 6.0 (Rozas et al., 2017). Interpopulation genetic parameters were also estimated based on pairwise comparisons of the average number of nucleotide differences, the average number of nucleotide substitutions per site, the coefficient of gene differentiation (Gst), and the number of migrants per generation (Nm = (1 - Gst)/(2 × Gst)), using DnaSP 6.0 (Rozas et al., 2017). Analysis of Molecular Variance (AMOVA) was also conducted using Arlequin version 3.5 (Excoffier & Lischer, 2011) to test for the presence of population structure and to estimate population differentiation. The pairwise genetic distance was calculated based on FST values with 1000 permutations. The exact test of population differentiation was also calculated based on haplotype frequencies, and its significance was tested with 10,000 Markov chain steps.
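To make the diversity parameters listed above more concrete, the following Python sketch shows how haplotype diversity, nucleotide diversity and Nm could be computed from a set of aligned sequences. It is an illustrative sketch, not DnaSP's code: the function names are hypothetical, haplotype diversity is assumed to use Nei's sample-size-corrected form Hd = n/(n - 1) × (1 - Σ p_i²), nucleotide diversity is taken as the mean number of pairwise differences per site, and Nm follows the (1 - Gst)/(2 × Gst) relation given in the text.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(seqs)  # identical strings are treated as one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi) for aligned sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / n_pairs / length


def migrants_per_generation(gst):
    """Nm = (1 - Gst) / (2 * Gst), as stated in the Methods."""
    if not 0 < gst <= 1:
        raise ValueError("Gst must be in the interval (0, 1]")
    return (1 - gst) / (2 * gst)


# Toy alignment, invented for illustration (not data from this study)
toy = ["AAAA", "AAAA", "AAAT", "AATT"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
```

The Nm relation also shows why low Gst values go together with high estimated migration: as Gst approaches zero, Nm grows without bound, whereas Gst close to one gives Nm close to zero.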

Results
Morphological descriptions. The general shapes of C. saniae from the Mahabad River and C. avicennae from the Razavar River, as well as the shapes of Gambetta's pigmentation zones, are presented in Figures 2 and 3.
Cobitis saniae. The Cobitis specimens collected from the Mahabad River possess the following key features as diagnostic signs of C. saniae (Eagderi et al., 2017b; Freyhof et al., 2018): the males have a single lamina circularis on the pectoral fin that is widely connected to the pectoral fin ray. There is a large black spot on the upper caudal-fin base, elongated scales with a small eccentric focal zone in the sub-dorsal region, and pigmentation on the cheek between the eye and operculum. The pattern of lateral pigmentation zones in the Mahabad population (Fig. 3A) was found to be slightly different from that described by Eagderi et al. (2017b). The mid-dorsal pigmentation is characterized by 15-19 large, distinct, dark-brown blotches, which are regularly shaped (as opposed to 13-19 often fused and irregularly shaped blotches found in the southern Caspian Sea basin). Zone Z1 is short and comprises small spots, extending to the dorsal-fin base in juvenile specimens and to the midpoint between the dorsal and caudal fin bases in adults (unlike in the southern Caspian Sea basin, where it reaches the caudal-fin base). Zone Z2 features 15-21 elongated blotches arranged in a strip that extends to the base of the caudal fin, while zone Z3 is composed of small, poorly developed spots that only reach the dorsal-fin base (in contrast to reaching the anal-fin base in the southern Caspian Sea basin). Lastly, zone Z4 consists of 13-15 distinct dark-brown blotches (compared to 13-23 blotches in the southern Caspian Sea basin). The morphometric characteristics of the C. 
saniae population from the Mahabad River are given in Table 1. Most morphometric characteristics overlap with those reported for this species from the southern Caspian Sea basin and the Zarineh River populations in Iran (Eagderi et al., 2017b; Eagderi et al., 2020), as well as three samples from the western basin of the Caspian Sea in Azerbaijan (Vasil'eva & Vasil'ev, 2020). There is, however, some variability between the Mahabad River and southern Caspian Sea basin populations. The Mahabad River population has a slender caudal peduncle, shorter postdorsal length, longer dorsal and anal fin base lengths, and smaller head depth and width at the nape. Morphometric variability is also observed between the Mahabad
The average nucleotide frequencies in 550 bp COI fragments of C. saniae, C. avicennae, and C. faridpaki were as follows: thymine (33.6; 31.9; 33.1%), cytosine (25.5; 26.4; 25.6%), adenine (24.0; 25.3; 24.7%), and guanine (17.0; 16.4; 16.6%) for each species, respectively. There were no deviations from mutation-drift equilibrium as revealed by neutrality tests for all species. All populations of three species showed low  3). Among the 47 sequences of C. saniae, six sequences of C. avicennae, and 17 sequences of C. faridpaki, seven, four, and seven haplotypes were defined, respectively. The intrapopulation genetic diversity analysis of the four C. saniae populations shows that the CSWCS and CSSTB populations from the west and southwest regions of the Caspian Sea basin have lower haplotype and nucleotide diversity than the CSKB and CSULB populations from the upstream regions of the Kura River basin and the Urmia Lake basin (Table 3). The total haplotype diversity and total nucleotide diversity within this species are 0.561 and 0.00176, respectively. Two populations of C. 
avicennae, one from Javanrud and the other from Razavar, were compared. The CAJ population exhibited identical sequences, whereas the CAR population demonstrated high haplotype diversity (Table 3). High haplotype diversity is also observed within the CFSCS and CFNLB populations of C. faridpaki from the southeast of the Caspian Sea basin and the Namak Lake basin, respectively (Table 3).
Several methods were used to analyse the genetic diversity and structure among populations of these species. In C. saniae populations, the pairwise coefficients of gene differentiation (Gst) and migrants per generation (Nm) indicated a low to moderate level of gene differentiation and a moderate to high number of migrants per generation (Table 4). Specifically, the CSSTB-CSKB and CSKB-CSWCS population pairs showed low levels of gene differentiation and high numbers of migrants per generation, while the remaining pairs displayed moderate levels. AMOVA revealed that 37.39% of the genetic variation in C. saniae occurred among populations and 62.61% within populations (Table 5). The overall FST value of 0.37 (P < 0.0001) suggested a strong genetic structure among C. saniae populations. The pairwise FST genetic distance analysis revealed significant genetic differences between the C. 
saniae population of the Urmia Lake basin and those from the southwestern region of the Caspian Sea basin, the western Caspian Sea basin, and the Kura River basin (Table 4). Significant genetic distance was also observed in pairwise comparisons between the Kura River basin and the western Caspian Sea basin populations. However, no significant genetic distances were found between the population from the southwestern region of the Caspian Sea basin and those from the Kura River basin and the western Caspian Sea basin (Table 4). The exact test for population differentiation based on haplotype frequencies, with a Markov chain length of 10,000 steps, revealed no significant genetic differentiation among these populations (P = 1.0).
The two populations of C. avicennae showed a moderate level of gene differentiation and a low number of migrants per generation, as indicated by their Gst and Nm values (Table 4). The AMOVA test indicated that 76.92% of the total genetic variation was among populations, while 23.08% was within populations (Table 5). However, neither the AMOVA result (P > 0.05) nor the exact test of population differentiation (P = 1.0) found significant genetic differentiation between the two populations.
The two populations of C. faridpaki showed a low Gst value and a high Nm value (Table 4). The AMOVA results indicated that genetic variation was minimal among populations (5.07%), with the majority of genetic variation (94.93%) occurring within populations (Table 5). Furthermore, pairwise FST comparisons did not reveal any significant genetic distance or population differentiation between these populations.
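The AMOVA percentages reported above map directly onto a fixation index. In a two-level AMOVA (populations not nested in groups), the index equals the among-population share of the total variance, which is why 37.39% among-population variation in C. saniae corresponds to the reported value of 0.37. The short Python sketch below only restates that arithmetic; the function name `fixation_index` is hypothetical, and the two-level design is an assumption about how the analysis was set up.

```python
def fixation_index(among: float, within: float) -> float:
    """Among-population share of total variance in a two-level AMOVA.

    `among` and `within` may be variance components or percentages,
    as long as both are on the same scale. Illustrative helper only.
    """
    total = among + within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return among / total


# Percentages of variation as reported in the Results (Table 5)
reported = {
    "C. saniae": (37.39, 62.61),
    "C. avicennae": (76.92, 23.08),
    "C. faridpaki": (5.07, 94.93),
}

indices = {
    species: round(fixation_index(among, within), 2)
    for species, (among, within) in reported.items()
}
```

With the reported percentages this gives about 0.37, 0.77 and 0.05 for the three species. A large index does not by itself imply significance: for C. avicennae the value is high, yet the permutation test was not significant, which is plausible with only six sequences.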

Discussion
A thorough understanding of the geographical distribution of freshwater fish populations can be beneficial for a number of reasons. This information can be used to determine the population size, genetic diversity, habitat preferences, and species richness of freshwater fishes in a region (Rosenfield, 2002; Griffiths, 2006). It also aids in understanding conservation status and informing management decisions. Two new sites were reported for C. saniae and C. avicennae. Phylogenetic and morphological analyses showed that these populations belong to C. saniae and C. avicennae, inhabiting the Mahabad River and Razavar River, respectively. While these populations exhibited most of the same morphological and morphometric characteristics as those reported for these species, some phenotypic variability was observed. Interpopulation phenotypic variability has been reported for morphological characters and colouration features in the key features of C. saniae populations from the western basin of the Caspian Sea in Azerbaijan, the Kura basin in Azerbaijan and Georgia, and the Sefidrud and Talesh-Mordab sub-basins in Iran (Vasil'eva & Vasil'ev, 2020). However, we could not find rounded scales in the subdorsal region or the subcutaneous dark spot in the lower part of the caudal-fin base in C. saniae from the Mahabad River population. We also could not find C. saniae specimens with Z1 and Z3 reaching the caudal-fin base, as reported in some samples of this species (Vasil'eva & Vasil'ev, 2020). Interpopulation variability was also observed in some morphometric measurements between C. saniae from the Mahabad River and the southern Caspian Sea basin and Zarineh River populations in Iran (Eagderi et al., 2017b; Eagderi et al., 2020), as well as those reported by Vasil'eva & Vasil'ev (2020) from Azerbaijan populations. Among several morphometric differences across C. 
saniae populations, the Mahabad River population exhibits specific variations compared to other regions. Specifically, it is distinguished from both the southern Caspian Sea basin and the Zarineh River populations by having less caudal peduncle depth and less head width. Additionally, it stands out from two populations in the western basin of the Caspian Sea in Azerbaijan due to its wider interorbital distance. There was also a difference in the postdorsal length between our data and the data reported by Eagderi et al. (2017b) for the southern Caspian Sea basin population. However, this difference appears to be due to an overestimation of this trait in the southern Caspian Sea basin population. This overestimation has been reported previously when comparing the populations of the western Caspian and the southern Caspian Sea basins (Vasil'eva & Vasil'ev, 2020). Despite extensive efforts to collect additional specimens of C. saniae in the Mahabad River, we were only able to obtain a limited number of specimens. Therefore, although the range of the dorsal fin depth and the length of the pectoral fin in males are greater than the values of these traits in females, the limited number of female specimens does not allow for a meaningful comparison of sexual dimorphism.
We also noted morphometric variability between the C. avicennae population from the Razavar River and that from the Gamasiab River, albeit in a limited number of measurements. These meaningful variations were in the body depth and the caudal peduncle depth.
The morphological differences in C. saniae and C. avicennae populations may be due to phenotypic plasticity, influenced by their varying habitats (Fusco & Minelli, 2010; Kelley et al., 2017). This adaptability is crucial for survival, as it allows species to respond to changes in their environment (Xue & Leibler, 2018). While there was a relatively higher phenotypic diversity among C. saniae populations, a limited number of phenotypic variations were found among C. avicennae populations. Cobitis saniae has a relatively wide distribution range and is found in a large geographical area compared to C. avicennae. It is found in the southern and western Caspian Sea basins in Iran and Azerbaijan, the Kura-Aras system in Iran, Azerbaijan, Armenia, Georgia, and Turkey, the Rioni River in the Georgian Black Sea basin, the Zarineh River (Eagderi et al., 2017b, 2020; Freyhof et al., 2018; Vasil'eva & Vasil'ev, 2020), and is now also known to occur in the Mahabad River in the Urmia Lake basin in Iran. In contrast, C. avicennae is found in limited locations in the Karkheh and Sirwan Rivers (Mousavi-Sabet et al., 2015; Freyhof et al., 2018), and is now also known to occur in the Razavar River, all within the Tigris basin in Iran. As a result, C. saniae, with its wider distribution range, could have a higher intraspecific phenotypic variability, so that it can better adapt to a variety of habitats.
Edris Ghaderi et al.
The percentages of each nucleotide in the COI fragments of C. saniae, C. avicennae, and C. faridpaki were relatively similar. This suggests that these three species are closely related to each other genetically. Despite the presence of one population with identical sequences in C. avicennae, there were moderate to high levels of haplotype diversity and low nucleotide diversity in each population of Cobitis species. There are few reports regarding the genetic diversity of Cobitis species. Using the mitochondrial COI gene, Buj et al. (2008) Buj & Šanda, 2014, C. vettonica Doadrio & Perdices, 1997, and Sabanejewia balcanica (Karaman, 1922) from their natural geographical distributions (Bajrić et al., 2021; Buj et al., 2015; Corral-Lou et al., 2022). However, Papoušek et al. (2008) reported low overall haplotype diversity and a high number of nucleotide variations within the populations of C. elongatoides from Czech and Slovak waters using this marker. Aside from the differences between the markers and the species studied, the high haplotype diversity and low nucleotide diversity within the Iranian species analysed suggest that the populations of these three species of Cobitis have diverged recently (Song et al., 2014; Xu et al., 2014) and experienced a genetic bottleneck followed by rapid growth and the accumulation of mutations among populations of these species (Grant & Bowen, 1998). It is apparent from the phylogenetic tree that multiple subclusters exist in all three studied species, supporting the conclusion that these species have diverged recently. According to the total haplotype diversity within each species, C. avicennae exhibits greater variability than the other species. The haplotype diversity of this species was measured using data from two populations, the Javanrud River and the Razavar River. In the Razavar population, haplotype diversity was high.
Interpopulation genetic diversity and genetic structure indices among populations of each species show low to moderate levels of divergence, except for some population pairs of C. saniae. A recent population expansion may contribute to low levels of genetic variation and differentiation among populations of these species. In comparisons based on FST genetic distance values among C. saniae populations, the Urmia Lake basin population displayed significant genetic distance from other populations of this species. Until recent periods (Holocene or Pleistocene), the Urmia Lake was connected to the Caspian Sea by a water canal through the Aras River, whose remains are still visible (Reichenbacher et al., 2011). Some zoogeographic evidence indicates that the fishes of the Urmia Lake basin originated from the Aras River in the west of the Caspian Sea (Armantrout, 1980). The Urmia Lake, however, is an endorheic, hypersaline water body that serves as a natural barrier between the two basins, so the fish fauna seems to have diverged since the basins were divided. There has also been a report of a strong genetic structure between populations of Silurus glanis, a common species that occurs in both basins (Bahrami Kamangar & Rostamzadeh, 2015). However, Cobitis species are smaller benthic fishes with lower migration abilities, which makes them more susceptible to genetic divergence among populations. The significant genetic distance between the Urmia Lake and other C. saniae populations can be attributed to the fact that, while all populations have recently expanded, the Urmia Lake population remained more isolated and thus exhibits a significant genetic distance. A small but significant genetic distance was also found between the populations of C. saniae in the western Caspian Sea basin and the Kura basin, based on the FST value. The majority of COI sequences assigned to the western Caspian Sea basin population were obtained from a study conducted by Vasil'eva et al. 
(2020), using samples collected in the Lankaran, Masalla, Astara, and Jalilabad regions of Azerbaijan. These regions fall within the Lesser Caucasian hydrogeological basin. In contrast, the population in the Kura basin is defined by COI sequences obtained from the Kura depression hydrogeological basin. Both basins exhibit unique zoogeographic patterns (Naseka, 2010), influenced by their geological context, climate, and hydrological features, which could impact the genetic distance of this common species.
The new populations of C. saniae and C. avicennae were described within their respective distribution ranges. Additionally, the genetic diversity and genetic structure of Iranian Cobitis were examined within and among most of the described populations. Despite the relatively small number of specimens examined for some samples, the results suggest that these populations have recently expanded. They also indicate the existence of isolated populations of these species. The vulnerability of certain Iranian Cobitis populations is exacerbated by a combination of factors, including moderate genetic diversity and ongoing environmental changes within their habitats. Recent droughts, overexploitation of running and underground water resources, and the introduction of domestic and agricultural sewage into the rivers where these species are found in Iran have all contributed to their heightened vulnerability. Given the significant genetic structure between the C. saniae population from the Urmia Lake basin and other populations of this species, it may be imperative to prioritize conservation efforts in these regions, which would help C. saniae remain genetically diverse. Further research is needed to confirm these findings and to learn more about what drives genetic variation within this species.

Figure 2. Lateral and dorsal views of Cobitis saniae (A and B) from the Mahabad River, Urmia Lake basin, and Cobitis avicennae (C and D) from the Razavar River, Karkheh River basin.

Figure 3. Gambetta's pigmentation zones in the males of Cobitis saniae (A) from the Mahabad River, Urmia Lake basin, and Cobitis avicennae (B) from the Razavar River, Karkheh River basin.

Figure 4. Phylogenetic relationships of Cobitis saniae and C. avicennae from the Mahabad River (CSUL) and Razavar River (CAR) with COI sequences retrieved from the GenBank database. The bootstrap values for Maximum Likelihood (10,000 replicates) and posterior probabilities for Bayesian Inference are represented on the branches, respectively. The species delimitation results based on the ASAP method are shown as a gray bar on the right of the tree.

Table 3. Intrapopulation genetic variability parameters of Cobitis saniae, C. avicennae, and C. faridpaki populations. N = number of sequences; S = number of polymorphic sites; Si = number of singleton variable sites; Pa = number of parsimony informative sites; h = number of haplotypes; Hd = haplotype diversity; Pi = nucleotide diversity; SD = standard deviation.