DNA barcoding and phylogenetic relationships in Anatidae

Abstract Mitochondrial cytochrome c oxidase subunit I (COI) has been used as a powerful marker in a variety of phylogenetic studies. According to studies of bird species, the 694-bp sequence of the mitochondrial gene encoding COI is extremely useful for species identification and phylogeny. In the present study, we analyzed the COI barcodes of 79 species from 26 genera belonging to the family Anatidae. Sixty-six species (83.54%) were identified correctly from their DNA barcodes. The remaining 13 species shared barcode sequences with closely related species. Kimura two-parameter (K2P) distances were calculated between barcodes. The average genetic distance between species was 41 times higher than the average genetic distance within species. The neighbor-joining method was used to construct a phylogenetic tree, which grouped all of the genera into three divergent clades. Dendrocygna and Nomonyx + Oxyura were identified as early offshoots of the Anatidae. All the remaining taxa fell into two clades that correspond to the two subfamilies Anserinae and Anatinae. Based on our results, DNA barcoding is an effective molecular tool for Anatidae species identification and phylogenetic inference.

A DNA barcode is a short sequence from a standardized genomic region of mtDNA that is specific to a species. A certain fragment of the mitochondrial gene COI, coding for a subunit of the enzyme cytochrome oxidase, has become widely known and used as "the DNA barcode" for the identification of most animal species (Breman et al., 2013; Cai et al., 2010; Hebert et al., 2003a,b, 2004a). It has also performed successfully in resolving the phylogeny of many animal groups, including birds (Hebert et al., 2004b; Johnsen et al., 2010; Kerr et al., 2007, 2009; Yang et al., 2014; Yoo et al., 2006).
To date, DNA barcoding studies on Anatidae birds remain limited. In the present study, we examined 694 bp of the COI gene of Anatidae birds and then conducted phylogenetic analyses within Anatidae based on these sequences, which will improve our understanding of the evolutionary biology of this animal group.

Material and methods
Two hundred and fifteen COI sequences were obtained from GenBank. A total of 79 species from 26 genera belonging to the family Anatidae were analyzed (Table S1).
Sequences were aligned using Clustal X (Thompson et al., 1997). A total of 694 bp of the mtDNA COI gene was analyzed. DnaSP v5.0 (Librado & Rozas, 2009) was used to define the variable sites. Sequence divergence among species and genera was calculated using the Kimura two-parameter (K2P; Kimura, 1980) distance model in MEGA 6.0 (Tamura et al., 2013). The neighbor-joining method (Saitou & Nei, 1987) was used to reconstruct the phylogenetic tree from the K2P distances in MEGA 6.0. Node support was assessed using the bootstrap method (Felsenstein, 1985).
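As an illustration of the distance model named above, the following minimal Python sketch computes the K2P distance between two aligned sequences from the proportions of transitions (P) and transversions (Q), using d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The function name `k2p_distance` and the choice to skip gapped or ambiguous sites (pairwise deletion) are assumptions of this sketch; it is not the MEGA implementation, and MEGA's handling of such sites may differ depending on its settings.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Assumption of this sketch: sites where either sequence has a gap or an
    ambiguous base are skipped (pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        both_purines = a in PURINES and b in PURINES
        both_pyrimidines = a in PYRIMIDINES and b in PYRIMIDINES
        if both_purines or both_pyrimidines:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # K2P: d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Calling `k2p_distance` on every pair of aligned barcodes would give a pairwise distance matrix of the kind used as input to neighbor-joining.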

Results
Four hundred and twenty variable sites were identified, of which 256 were parsimony-informative (36.89% of the entire sequence). About 83.54% (66 out of 79) of the species were identified correctly from their DNA barcodes. The remaining 13 species were not monophyletic in their COI sequences because they shared barcode sequences with closely related species. In seven species pairs, both species shared a haplotype: Anas penelope and A. falcata, A. sibilatrix and A. flavirostris, A. platyrhynchos and A. poecilorhyncha, A. rubripes and A. poecilorhyncha, A. discors and A. cyanoptera, Somateria spectabilis and S. mollissima, and Anser rossii and A. caerulescens.
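The percentages and species counts in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the figures and species pairs reported above; the variable names are illustrative. It also shows why seven pairs correspond to 13 species: A. poecilorhyncha appears in two pairs.

```python
# Figures reported in the Results
n_species = 79
n_identified = 66
alignment_length = 694
parsimony_informative = 256

success_rate = 100 * n_identified / n_species
informative_share = 100 * parsimony_informative / alignment_length

# The seven haplotype-sharing pairs listed in the text
shared_pairs = [
    ("Anas penelope", "Anas falcata"),
    ("Anas sibilatrix", "Anas flavirostris"),
    ("Anas platyrhynchos", "Anas poecilorhyncha"),
    ("Anas rubripes", "Anas poecilorhyncha"),
    ("Anas discors", "Anas cyanoptera"),
    ("Somateria spectabilis", "Somateria mollissima"),
    ("Anser rossii", "Anser caerulescens"),
]
# A set removes the duplicate entry for Anas poecilorhyncha
species_sharing = {name for pair in shared_pairs for name in pair}

print(f"{success_rate:.2f}%")       # 83.54%
print(f"{informative_share:.2f}%")  # 36.89%
print(len(species_sharing))         # 13
```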
K2P genetic distances within species had a small range (0 to 2.57%), with more than 93.22% of the observations below 1% (Figure 1). Pairwise comparisons among species ranged from 0 to 19.26%, with most of the comparisons (79.75%) falling between 8% and 15% K2P genetic distance (Figure 1). The average difference in the COI sequence between species (11.19%) was 41 times higher than the average difference within species (0.27%).
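The within-species versus between-species comparison can be sketched as follows, assuming a precomputed pairwise distance matrix and one species label per sequence. The function name `barcode_gap_summary` and the toy data are hypothetical and are not the study's data; only the final ratio uses the reported averages (11.19% and 0.27%).

```python
from itertools import combinations
from statistics import mean


def barcode_gap_summary(species, distances):
    """Return (mean_within, mean_between, ratio) of pairwise distances.

    species:   list of species labels, one per sequence
    distances: square matrix (list of lists) in the same order
    """
    within, between = [], []
    for i, j in combinations(range(len(species)), 2):
        # Same label -> within-species pair, otherwise between-species pair
        target = within if species[i] == species[j] else between
        target.append(distances[i][j])
    if not within or not between:
        raise ValueError("need within- and between-species pairs")
    mean_within = mean(within)
    mean_between = mean(between)
    ratio = mean_between / mean_within if mean_within > 0 else float("inf")
    return mean_within, mean_between, ratio


# Toy example with made-up distances for two hypothetical species
toy_species = ["sp1", "sp1", "sp2", "sp2"]
toy_distances = [
    [0.000, 0.002, 0.100, 0.120],
    [0.002, 0.000, 0.110, 0.100],
    [0.100, 0.110, 0.000, 0.004],
    [0.120, 0.100, 0.004, 0.000],
]
toy_within, toy_between, toy_ratio = barcode_gap_summary(toy_species, toy_distances)

# Ratio implied by the averages reported in the text (about 41.4)
reported_ratio = 11.19 / 0.27
```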
Most species could be discriminated by their distinct clusters in the phylogenetic tree (Figure S1). The phylogenetic tree grouped the species of each genus into a single cluster. Dendrocygna was the first to split from the Anatidae lineage. Monophyly of the Nomonyx + Oxyura clade was well supported, with a high bootstrap value (BP = 97%). Analysis of the COI gene indicated that the other genera fell into two clades corresponding to the two subfamilies Anserinae and Anatinae.

Discussion
The results of the present study demonstrated the discriminatory power of COI barcodes for the identification of Anatidae species, with 83.54% of currently recognized species displaying unique barcodes. The identification success rate was slightly lower than the success rate reported in other DNA barcoding studies in birds (Breman et al., 2013; Johnsen et al., 2010; Kerr et al., 2007; Tavares & Baker, 2008; Yoo et al., 2006). Hybridization and incomplete lineage sorting are two important evolutionary processes that often confound phylogenetic inference (McGuire et al., 2007). The thirteen species that shared barcodes with closely related species in sympatry likely experienced hybridization. Hybridization is widespread in birds (McCarthy, 2006). Among the orders of birds, Anseriformes showed the greatest propensity to hybridize, with an incidence approaching one out of every two species (Grant & Grant, 1992). Hybridization presents challenges for identifying individuals and reconstructing bird phylogenies.
Anatidae are a well-studied group of birds; however, some aspects of their evolutionary relationships have remained unclear. Here, we provide the first phylogenetic analysis for Anatidae using the COI gene. Analysis of COI sequences produced a well-supported phylogeny for Anatidae. The phylogenetic tree supported a basal position of Dendrocygna in Anatidae (Figure S1). Traditionally, members of Dendrocygna were considered to be Anserinae based on morphological and behavioral characters (Delacour & Mayr, 1945; Del Hoyo et al., 1992; Johnsgard, 1978). However, some authors proposed that Dendrocygna represented an independent lineage, unrelated to Anserinae (Livezey, 1997; Sibley & Ahlquist, 1990). The COI data indicated that members of Dendrocygna emerged as the sister group of the outgroup (Figure S1). It is therefore likely that the genus Dendrocygna is not a member of the Anserinae, but rather a basal lineage of Anatidae. Some molecular analyses have also supported this basal placement, including allozyme data (Numachi et al., 1983), DNA-DNA hybridization (Sibley et al., 1988; Sibley & Ahlquist, 1990), and mtDNA sequences (Donne-Goussé et al., 2002; Sorenson et al., 1999). Similarly, members of Oxyura and Nomonyx were considered to be Anatinae based on morphological and behavioral characters (Delacour & Mayr, 1945; Johnsgard, 1978; Del Hoyo et al., 1992). The COI analysis supported an independent grouping of the genera Oxyura and Nomonyx. Our molecular results suggested that the two genera diverged from other Anatidae earlier than the Anatinae/Anserinae split. Within Anatidae, excluding the three above-mentioned genera, the COI analysis supported the conventional division between Anatinae and Anserinae. This basal dichotomy was also observed in other molecular studies (Donne-Goussé et al., 2002; Sorenson et al., 1999).
Within Anserinae, the present analysis succeeded in producing a completely trichotomous phylogenetic tree, and most of the included nodes had robust support. The position of Coscoroba has been much disputed (Donne-Goussé et al., 2002). Traditionally, Coscoroba and Cygnus have been considered sister genera based on behavioral characteristics and morphology (Del Hoyo et al., 1992; Livezey, 1997). Zimmer et al. (1994) proposed, on the basis of mtDNA srRNA sequence data, that Coscoroba might be the sister group of the geese and swans, as opposed to being the sister group of the swans alone (Johnsgard, 1978; Livezey, 1986). More recently, mtDNA control-region data showed that Coscoroba coscoroba and Cereopsis novaehollandiae were sister species (Donne-Goussé et al., 2002). Coscoroba was the first to split from the Anserinae lineage in our analysis. The COI data therefore supported the result of Zimmer et al. (1994).
Five clades were clearly recognizable within Anatinae (Figure S1), and these did not fully match the existing tribes. Our analysis found that Netta grouped with Aythya and that Aix grouped with Cairina, which was in accordance with the results of other molecular analyses (e.g. Donne-Goussé et al., 2002; Sorenson et al., 1999; Sraml et al., 1996). Phylogenetic relationships of the tribe Anatini (dabbling ducks) remain controversial despite intensive study (Donne-Goussé et al., 2002; Johnson & Sorenson, 1998; Livezey, 1991). Livezey (1991) suggested that the tribe Anatini included all of the dabbling ducks and many of the perching ducks (Anas, Lophonetta, Cairina, Aix, Callonetta and Chenonetta). The COI analysis showed that Anas was the sister group of Lophonetta, Tachyeres and Amazonetta (Figure S1). Johnson & Sorenson (1998) also found that Anas was not a monophyletic genus, since it contained species of the genera Lophonetta, Amazonetta, Speculanas, and Tachyeres. The close relationship between Lophonetta and Anas is also reflected in morphological analyses: in some works the crested duck Lophonetta specularoides is called Anas specularoides (Donne-Goussé et al., 2002).
In conclusion, most Anatidae species have distinct COI sequences. DNA barcoding is an effective molecular tool for species identification and phylogenetic inference in Anatidae. However, to resolve the phylogenetic relationships of Anatidae unambiguously, broader taxon sampling and multiple nuclear markers will be needed in future studies.