Mitochondrial genome structure, phylogenetic analyses and substitution rate estimation of the Oedogoniales

Abstract The order Oedogoniales comprises three genera, Oedogonium, Oedocladium and Bulbochaete, which were classified based on traditional morphological criteria, and includes more than 600 described species. This group is economically important in astaxanthin production and the energy sector. However, only one mitochondrial genome (mitogenome) has been reported so far. This study determined the mitochondrial genomes of seven Oedogoniales species, including six Oedogonium species and Oedocladium prescottii. Comparative analyses between the newly determined mitogenomes and the previously reported Bulbochaete rectangularis var. hiloensis mitogenome showed that all eight mitogenomes comprised 12 protein-coding genes (PCGs) and two rRNAs; however, the mitogenomes differed in their genome sizes, GC content, tRNAs, non-coding regions and introns. Synteny analysis of the eight mitogenomes revealed a high degree of syntenic conservation in general, with some rearrangements and inversions. The average nucleotide identity (ANI) analysis of the eight mitogenomes indicated Oedogonium dentireticulatum showed high similarity with Oedogonium sp3 (ANI of 96.32%). Most of the PCGs of the eight mitogenomes presented the conventional start codon ATG and stop codon TAR (TAA/TAG/TGA), and the synonymous codon preferences were conserved. Phylogenetic results indicated that Oedogonium was polyphyletic, and species of Oedocladium clustered with Oedogonium, while the position of B. rectangularis var. hiloensis was uncertain for the incongruent phylogenetic results. Statistical analyses of substitution rates demonstrated no significant differences among the three genera, and the dN/dS ratios based on branch model showed that cob, cox1 and nad4 of Oedocladium prescottii and B. rectangularis var. hiloensis were putative fast-evolving genes. These findings suggested that the traditional taxonomy of Oedogoniales did not define natural groups, and that species of Oedocladium and Bulbochaete may have undergone rapid evolution.


Introduction
The order Oedogoniales includes three genera, Oedogonium Link ex Hirn, Oedocladium Stahl and Bulbochaete Agardh (Agardh, 1817;De Bary, 1854;Stahl, 1891;Hirn, 1900), which were classified based on traditional morphological criteria such as presence of branches and hairs (Hirn, 1900;Tiffany, 1937;Gemeinhardt, 1939;Islam & Sarma, 1963;Gauthier-Lièvre, 1964;Jao, 1979;Mrozińska, 1985;van Den Hoek et al., 1995;Graham & Wilcox, 2000;Liu & Hu, 2004). Species of Oedogoniales are economically important in the astaxanthin production industry and the energy sector. The genus Oedogonium was recently targeted for biomass applications due to its high productivity, favourable biochemical composition, global distribution and competitive dominance over other algal taxa (Lawton et al., 2014). Oedocladium species have great potential for use in aquaculture as they can produce large amounts of astaxanthin esters (Chen et al., 2019) which are commonly used in fish aquaculture as a pigmentation agent (Lorenz & Cysewski, 2000;Choubert et al., 2006;Li et al., 2014;Lim et al., 2017) or to increase antioxidant activity in fish (Liu et al., 2016;Lim et al., 2017) and can also support the human immune system (Jyonouchi et al., 1995). Previous studies showed that Oedocladium species used as feed ingredients could improve growth performance, tissue fatty acid profiles, pigmentation and immunity of the Gibel carp (Chen et al., 2019). However, molecular phylogenetic studies of the order Oedogoniales are still limited.
It has been reported that Oedogoniales are monophyletic and Bulbochaete may be deep-branching compared with the other two genera (Booton et al., 1998;Buchheim et al., 2001;Krienitz et al., 2003;Shoup & Lewis, 2003). By using nuclear 18S rDNA of 10 Oedogonium species, Alberghina et al. (2006) suggested that Oedogonium might not be monophyletic and that the morphological characteristics may not define the phylogenetic groups. Mei et al. (2007) employed 18S rDNA sequences and found that Oedocladium formed a separate clade within Oedogonium, whereas Bulbochaete was sister to the other two genera; however, they noted that this phylogeny required further investigation with broader sampling of these taxa, particularly Oedocladium and Bulbochaete. Xiong et al. (2021) indicated that Oedogonium was polyphyletic and Oedocladium was paraphyletic based on chloroplast genomes proteincoding genes (PCGs). With the application of nextgeneration sequencing technology, an increasing number of mitogenomes have been sequenced in recent years, and phylogenetic analysis based on mitogenomes has been widely used in different types of species, such as fish, parasites, brown algae, red algae and green algae (Hall et al., 2008;Durand & Borsa, 2015;Lee et al., 2018;Vanhove et al., 2018;de Sousa et al., 2019). However, only one mitogenome of the family Oedogoniaceae (i.e. mitogenome of Bulbochaete rectangularis var. hiloensis) is currently available in public databases (Turmel et al., 2020). The mitogenome, with a circular architecture, uses a standard genetic code, contains 41 conserved genes including 27 tRNA genes, 12 protein coding genes (PCG) and 2 rRNA genes.
In this study, we sequenced the mitogenomes of seven Oedogoniales species, including six Oedogonium species and Oedocladium prescottii, and conducted comparative genome analyses, phylogenetic investigations, and substitution rate estimation with the previously reported B. rectangularis var. hiloensis mitogenome. The study enriched the genetic information on the mitogenomes of this group, which will be helpful to better understand the phylogeny and evolution of this group.

Phylogenetic analyses
Phylogenetic analyses were performed by examining the sequences of PCGs and 18S rDNA of the Oedogoniales mitogenomes, and we used species of Chaetophorales as outgroup. Amino acid and nucleotide datasets of mitogenomes were assembled using the following 12 PCGs: atp9,cob,cox1,cox2,cox3,nad1,nad2,nad3,nad4,nad4L,nad5 and nad6. For the amino acid dataset, the genes were aligned using MAFFT 7.0 (Katoh & Standley, 2013), whereas for the nucleotide dataset, the genes were additionally aligned using the MUSCLE function of MEGA7 with the option "align codons" (Edgar, 2004;Kumar et al., 2016). Ambiguous regions were removed from each alignment using trimAl 1.2 (Capella-Gutierrez et al., 2009) with the option gt = 1. Evolutionary models and partitions of the datasets were determined using PartitionFinder 2 (Lanfear et al., 2017), and the best partitions are shown in Supplementary table S1. The 18S rDNA sequences were aligned using MAFFT 7.0 (Katoh & Standley, 2013), and ambiguous regions were manually edited and adjusted by eye using MEGA7 (Kumar et al., 2016). jModelTest2 was used to determine the 18S rDNA sequences and the best model was noted to be GTR+I+G (Darriba et al., 2012). DAMBE (Xia & Xie, 2001) was used to assess if the genes were experiencing substitution saturation. Supermatrix and coalescent-based analyses were used to construct phylogenetic tree. For the supermatrix analysis. Phylosuite (Zhang et al., 2020) was used to concatenate all of the genes. Maximum likelihood (ML) and BI (Bayesian inference) methods were used for inferring phylogenies. Phylogenetic calculations were performed using ML method on the IQ-TREE web server (Trifinopoulos et al., 2016) with 1000 ultrafast bootstraps (Bui Quang et al., 2013) and 1000 SH-aLRT tests (Guindon et al., 2010) to examine nodal support. Bayesian analyses were performed using MrBayesv3.2.6 (Ronquist et al., 2012), and the datasets of the mitogenome PCGs were partitioned as shown in Supplementary table S1. Markov chain Monte Carlo analyses were run with four Markov chains (three heated, one cold) for 1000000 generations, and trees were sampled every 1000 generations. In each round of calculation, a fixed number of samples (burn-in = 1000) was discarded from the beginning of the chain. For the coalescent-based analyses, a ML analysis of each gene was performed in RAxML v8.2.10 (Stamatakis, 2014) by conducting 1000 rapid bootstrap replicates with the General Time Reversible model with estimate of proportion of invariable sites and rate heterogeneity (GTRGAMMAI model). Then, the resulting best trees were used to infer the coalescence-based species tree phylogeny with Accurate Species Tree ALgorithm program (ASTRAL) v5.6.3 (Zhang et al., 2018).

Substitution rate estimation
After aligning and trimming each gene, synonymous and non-synonymous substitutions (dS and dN, respectively) were measured using CODEML program of the phylogenetic analysis by maximum likelihood (PAML) v4.9 (Yang, 2007). The ML model was used with the following options: runmode = −2 and CodonFreq = 2. The dS and dN of the 12 PCGs of the eight Oedogoniales mitogenomes were calculated. The eight Oedogoniales species were divided into three groups at the genus level, and pairwise comparison was performed to determine any significant difference among the three genera. Post.hoc.kruskal.nemenyi. test () was used for pairwise comparison of the dN and dS values, and multiple testing was corrected by applying false discovery rate method (FDR) (Benjamini & Hochberg, 1995). Branch model was employed to determine whether Oedocladium and Bulbochaete have different dN/dS ratios relative to Oedogonium, and Oecladium prescottii and Bul. rectangularis var. hiloensis were set as foreground branches. A null model (model = 0), where one dN/dS ratio was fixed across all strains, was compared with an alternative model (model = 2), where Oedocladium and Bulbochaete spp. were allowed to have a different dN/dS ratio. Likelihood ratio tests were used to examine model fit, Chi-square test was applied for analysing the p values, and multiple testing was corrected by FDR. The genes were considered as putative fast-evolving genes if they had an FDR-adjusted p < 0.05 and a higher dN/dS ratio in the foreground branch than in the background branches.

General features of the Oedogoniales mitogenomes and synteny analysis
The characteristics of the Oedogoniales mitogenomes are listed in Table 1. All the newly added mitogenomes showed the same typical circular mapping as that of B. rectangularis var. hiloensis with total length ranging from 47180 bp (Oedogonium crispum) to 110880 bp (Oedogonium capilliforme) (Supplementary figs S1-S7). The overall GC content in each mitogenome was comparable and presented little difference among the species, varying from 30.64% (Oedogonium crispum) to 41.06% (Oedocladium prescottii). Besides, the difference in the non-coding region varied from 79.89% (Oedogonium crispum) to 89.89% (Oedogonium capilliforme), showing a trend similar to that of the total length of the mitogenomes. All the eight Oedogoniales mitogenomes included the same 12 PCGs and two rRNA genes but showed a slight difference in the numbers of tRNA genes. For instance, Oedogonium dentireticulatum, Oedogonium crispum, Oedogonium sp3, and B. rectangularis var. hiloensis presented 27 tRNA genes, whereas Oedogonium capilliforme and Oedogonium sp2 lacked trnG (gcc) and Oedogonium sp1 and Oedocladium prescottii lacked trnL (caa), thus possessing 26 tRNA genes. Furthermore, the eight Oedogoniales mitogenomes showed differences in the intron content, with Oedocladium prescottii presenting the lowest level of introns (7 group I introns and 4 group II introns) and Oedogonium sp3 revealing the maximum introns content, including 19 group I introns and 3 group II introns. All eight species contained group I introns and group I introns in cox1, expect for Oedogonium crispum and Oedogonium capilliforme, the group I introns in cox1 contain an open reading frame (ORF) encoding putative LAGLIDADG homing endonucleases.
ProgressiveMauve was used to analyse the Oedogoniales mitogenomes synteny, with Oedocladium prescottii used as a reference to compare the gene order among the mitogenomes (Fig. 1). A total of 7-9 locally collinear blocks were identified in the mitogenomes of the eight Oedogoniales species. A relatively high degree of syntenic conservation, with some rearrangements and inversions, was observed among the eight Oedogoniales mitogenomes. In the mitogenome of Oedogonium sp3, the sixth and seventh blocks showed rearrangements and inversions, with the sixth block including nad2 and the seventh block comprising 12 tRNAs, 4 ORFs, nad4L and nad6. The other small blocks in Oedogonium sp3, Oedogonium. sp2, Oedogonium capilliforme and B. rectangularis var. hiloensis mainly comprised the ORFs. Subsequently, the ANI of the eight mitogenomes was calculated using FastANI (Supplementary fig. S8), and the results indicated Oedogonium dentireticulatum showed high similarity with Oedogonium sp3 (ANI of 96.32%).

Codon usage analysis of PCGs
Most of the PCGs were found to use the conventional start codon ATG, except for cox2 of Oedogonium crispum and Oedocladium. prescottii and nad1 of Oedogonium sp1, which presented start codon TTG, CTG and TTG, respectively (Supplementary table  S2). Furthermore, majority of the PCGs terminated with the codon TAR (TAA/TAG/TGA), except for cox2 of Oedogonium sp1 and nad3 of Oedogonium sp1 and Oedogonium dentireticulatum, which terminated with AAA and AAG, respectively (Supplementary table S2). Excluding the stop codons, the mitogenome PCGs consisted of 3972-4052 codons, and all the eight Oedogoniales species showed very similar codon usage (Fig. 2). The six predominant codon families included Val (279.88 ± 9.82 codons), Phe (296.38 ± 11.42 codons), Ala (309.75 ± 11.70 codons), Ser (317.25 ± 8.71 codons), Ile (345.00 ± 13.62 codons) and Leu (577.13 ± 7.30 codons). Among them, Leu exhibited the highest usage bias (571-576 codons), whereas Cys (69.50 ± 3.77 CDs) showed the least number of codons. To gain an insight into the genetic codon bias of the eight Oedogoniales mitogenomes, the relative synonymous codon usage was evaluated. As shown in Fig. 3, the usage of synonymous codons was biased for most of the amino acids. The three most commonly used codons in the examined eight Oedogoniales mitogenomes were consistently UTA, CGU and AGU.

Phylogenetic analyses
The results of 18S rDNA sequences based on ML and BI methods showed that B. rectangularis var. hiloensis was sister to Oedogonium and Oedocladium in the phylogenetic tree with high support value (1 and 100 in BI and ML tree respectively) (Supplementary fig. S9). And Oedocladium species formed two clades clustering with Oedogonium, respectively, which indicated that both Oedogonium and Oedocladium were polyphyletic, which is in accordance with that reported in previous studies (Mei et al., 2007;Xiong et al., 2021). For the three species of unbranched filaments belonging to genus Oedogonium which could not be defined as concrete species for a lack of sexual characters, the results of 18S rDNA showed they were different from each other and also different from the known species reported before. The phylogenetic results by the coalescentbased analysis were consistent with the supermatrix analysis based on nucleotide dataset (Fig. 6). For the supermatrix analysis, there were three kinds of phylogenetic results (Figs 4-6). Phylogenetic trees based on the nucleotide dataset by BI and ML were consistent, while phylogenetic results based on the amino acid dataset were different by BI and ML methods, and the results based on the two kind datasets also were different.
The findings of ML and BI methods based on amino acid dataset revealed that Oedocladium prescottii clustered with Oedogonium capilliforme but showed a slight difference at the level of B. rectangularis var. hiloensis. The ML results (Fig. 4) indicated that B. rectangularis var. hiloensis was sister to Oedogonium and Oedocladium and separated from the others, whereas BI results showed that Oedogonium sp2 was at the base and that B. rectangularis var. hiloensis clustered with the clade including the other Oedogonium species and Oedocladium prescottii (Fig. 5). Phylogenetic trees by the coalescent-based analysis and the supermatrix analysis based on nucleotide dataset indicated that all eight Oedogoniales species separated into two clades with a high support value (1 and 100 in  (Fig. 6).

Substitution rate estimation
The dN and dS rates for all the 12 PCGs of each Oedogoniales mitogenome were calculated using the ML method implemented in PAML (Supplementary  table S3 Subsequently, the branch model of PAML was employed to calculate the dN/dS ratios of the 12 protein-coding genes of the three genera, with Oedocladium prescottii and B. rectangularis var. hiloensis labelled as the foreground branches ( Table 2). The results showed that cob, cox1 and nad4 had FDR-adjusted p < 0.05, which implied that the dN/dS ratios of these genes were significantly different among the genera Oedocladium, Bulbochaete and Oedogonium, suggesting the occurrence of rapid evolution in Oedocladium prescottii and B. rectangularis var. hiloensis.

Discussion
Comparative analyses of the eight Oedogoniales mitogenomes showed they were relatively conserved at the levels of structures and organizations. All the newly determined mitogenomes showed the same typical circular mapping as that of B. rectangularis var. hiloensis mitogenome. Previous studies have indicated that the mitogenomes of colonial volvocine algae (Chlorophyta, Chlamydomonadales) had linear or circular architectures, and that the changes in the mitogenome architecture may be correlated to the phylogeny or evolution of this group (Moore & Coleman, 1989;Smith & Lee, 2010;Hamaji et al., 2013Hamaji et al., , 2017Smith et al., 2013;Featherston et al., 2016;Hu et al., 2019b). The results of the present study showed that the mitogenome architecture of Oedogoniales was relatively conserved. While the contents of PCGs and rRNAs in the Oedogoniales mitogenomes were conserved, some differences were noted in the contents of tRNA genes and introns. The results of synteny analysis indicated that the eight mitogenomes were relatively conserved, and that the three genera showed no significant difference, except for the rearrangement and inversion in the sixth and seventh blocks of Oedogonium sp3 mitogenome. Furthermore, the ANI analysis revealed Oedogonium dentireticulatum showed high similarity with Oedogonium sp3 (ANI of 96.32%). Phylogenetic results based on amino acid and nucleotide datasets showed that Oedogonium dentireticulatum clustered with Oedogonium sp3 Fig. 4. Phylogenetic tree based on 12 mitochondrion genes was generated by the amino acid data sets based on BI method. Numbers at the branches represent Bayesian posterior probabilities. Scale bar indicates substitutions per site.
with high support value (1 and 100 in BI and ML tree respectively). Most of the PCGs of the eight Oedogoniales mitogenomes used the conventional start codon ATG and stop codons TAR (TAA/ TAG/TGA), except for some PCGs of certain species with start codons TTG, CTG and TTG and stop codons AAA and AAG, which may be related to post-transcriptional modification during mRNA maturation (Sun et al., 2021). The PCGs of the eight mitogenomes consisted of 3972-4052 codons and showed very similar codon usage, with Leu exhibiting the highest usage bias (571-576 codons), which may be associated with the coding function of mitochondrion of encoding many transmembrane proteins (Lu et al., 2013). Moreover, the synonymous codon preferences for the eight Oedogoniales species were found to be conserved, which may be attributed to their close phylogenetic relationship. The results of phylogenetic analyses of 18S rDNA and the PCGs sequences of the mitogenomes by coalescent-based and supermatrix analyses based on amino acid and nucleotide datasets by ML and BI methods supported the findings of previous studies, indicating that Oedogonium was polyphyletic, species of Oedocladium clustered with Oedogonium. However for the position of  Bulbochaete, different phylogenetic results were incongruent.
Previous study used different reconstruction methods based on mitogenome PCGs to analyse the phylogenetic relationship of colonial volvocine algae and found that stochastic error, linkage group and biological factors may cause phylogenetic incongruence (Hu et al., 2020). It has been proposed that larger sample sizes can substantially improve phylogenetic results (Pollock et al., 2002); hence, future studies need to assess incongruence with more related species.
In the present study, the substitution rates were evaluated by ML method of PAML, which is the most accurate method currently available for measuring substitution rates (Guindon et al., 2010;Smith et al., 2013). The results of statistical analyses showed no significant difference between the three genera, and the dN/dS ratios based on branch model revealed that cob, cox1 and nad4 of Oedocladium prescottii and B. rectangularis var. hiloensis were putative fast-evolving genes. Thus, increasing the amount of molecular data indicated that the traditional taxonomy of Oedogoniales did not define natural groups and the evolutionary position of Oedocladium, Oedogonium and Bulbochaete remains uncertain. We supposed that species of Oedocladium and Bulbochaete may have undergone rapid evolution and further molecular studies, especially on the genus Bulbochaete, are needed to confirm this hypothesis and resolve the phylogenetic relationship of the order Oedogoniales.

Supplementary material
The