Geography and climate drive the distribution and diversification of the cosmopolitan cyanobacterium Microcoleus (Oscillatoriales, Cyanobacteria)

ABSTRACT
Despite the extensive diversity of bacteria and their importance to the fundamental functioning of terrestrial ecosystems, their distribution patterns are still not fully known. To fill the gap and further understand the biogeographic patterns in bacteria, we investigated the phylogeographic structure and the underlying drivers of diversification among populations of the cyanobacterium Microcoleus spp. The phylogenetic history was reconstructed using 16S rRNA genes and the 16S–23S internal transcribed spacer (ITS) of 495 Microcoleus spp. isolates. Ancestral area and state reconstruction was employed to investigate the distributional and ecological patterns within Microcoleus. Both isolation by distance and isolation by environment were tested with distance matrices analysis. The phylogenetic signal tests were conducted in order to assess the influence of the climatic preferences on the diversification of Microcoleus isolates. The distribution and phylogenetic diversification of Microcoleus are driven by both isolation by distance and environment, leading to at least 13 distinct lineages that could represent novel cyanobacterial species. Microcoleus spp. exhibited a distinct phylogeographic structure within the respective lineages. The ancestral area and state reconstruction revealed that Microcoleus most likely arose in Europe in terrestrial habitats. The phylogenetic signal showed that the phylogeny significantly affects the climatic preferences of Microcoleus strains. Geographic distance and contemporary climatic conditions play significant roles in shaping the distribution and diversification of Microcoleus. The observed patterns of distribution may shift in the future due to the impact of climate change.

Highlights
Microcoleus exhibited distinct phylogeographic structure within the respective lineages.
Geographic and environmental heterogeneity affect Microcoleus distribution and diversification.
Genetically distinct lineages coexist at the same site.


Introduction
The number of microbial species worldwide is immense, and we have only a limited understanding of their dispersal and distribution patterns. Articulating microbial biogeographic patterns is hampered by the inconsistent use of species concepts, understudied microhabitats, underestimated diversity and unknown rates of dispersal (Fierer, 2008), and this is further complicated by different methodological approaches (e.g. sequencing strategy as reviewed in Hanson et al., 2012). With the development of next-generation sequencing, it has become possible to better detect biogeographic patterns and mechanisms of speciation (selection, dispersal, genetic drift and mutations; Hanson et al., 2012). Different microbial species exhibit distinct distribution and evolutionary patterns; therefore, describing the biogeographic patterns in nonmodel organisms is an essential step to better understanding microbial biogeography in a broader context. Microorganisms differ from most macroorganisms because they have greater dispersal ability and larger population sizes, and may form resistant propagules within their life cycles (Foissner, 2006; Fontaneto & Brodie, 2011). Initially, microbiologists assumed that spatial differentiation was not evolutionarily significant, as expressed in Baas-Becking's (1934) tenet 'everything is everywhere, but the environment selects'. Recent studies (e.g. Finlay, 2002; Fenchel, 2003) support this assertion, but employ only morphological data as evidence. However, molecular studies have shown that the dispersal of free-living microorganisms may actually be limited (e.g. Miller et al., 2007; Bates et al., 2013; Aguilar et al., 2014; Ribeiro et al., 2020), although this is variable because dispersal barriers may only be temporary (Bahl et al., 2011; Dvořák et al., 2012). Reno et al. (2009) demonstrated the existence of dispersal barriers on a continental scale using genome data in the archaeon Sulfolobus. Thus, morphological data seem to have limited resolution. This may be exacerbated by cryptic speciation, in which low morphological diversity conceals much larger genetic diversity (Leliaert et al., 2014).
In cyanobacteria, studies focused on phylogeographic structure seem to provide ambiguous results. For instance, Microcystis cyanobacteria did not exhibit any phylogeographic pattern according to Van Gremberghe et al. (2011) and Ribeiro et al. (2020), but Raphidiopsis raciborskii populations have a distinct structure (Ribeiro et al., 2020), as do those of Microcoleus vaginatus. While these studies indicate the complex nature of the diversification and distribution of cyanobacteria, they had only limited population-level resolution since each locality was mainly represented by a single strain and sequence.
Phylogenetic signal is a measure of the significance of the relationship between species traits and genetic relatedness (Blomberg et al., 2003) and has been used to highlight the importance of ecology on the distribution of microorganisms (e.g. Aguilar et al., 2014; González-Rocha et al., 2017). Even without a clear consensus on measuring the phylogenetic signal, this concept needs consideration when detecting biogeographic species patterns (Losos, 2008; Crisp & Cook, 2012). In cyanobacteria, the phylogenetic signal has previously been determined with morphological, physiological (e.g. Uyeda et al., 2016) and some environmental factors including nutrients, pH and sea surface temperature (e.g. Larkin et al., 2016).
Microcoleus vaginatus is a cosmopolitan, filamentous bundle-forming cyanobacterium that inhabits benthic and subaerial habitats (Garcia-Pichel et al., 1996). It is an essential part of biological soil crusts in both hot and cold deserts, where it plays an important role in biogeochemical cycling and in stabilizing soil particles with its exopolysaccharides (Garcia-Pichel & Wojciechowski, 2009). Microcoleus is polyphyletic (e.g. Siegesmund et al., 2008; Hašler et al., 2012), and Strunecký et al. (2013) defined a new type material for M. vaginatus. They noted that M. vaginatus is composed of at least six species, but these were only vaguely defined, and thus a revision of the genus Microcoleus is needed. In this paper, we seek to shed some new light on the biogeography and the evolution of free-living organisms using phylogenetic reconstruction of the cyanobacterium Microcoleus. Here, we recognize the existence of many lineages within M. vaginatus and we will refer to them as 'Microcoleus spp.'. We investigated patterns of diversity from a single Microcoleus spp. population to continent-scale distribution, on both an inter- and intraspecies level. Moreover, we identify the relationship between climate, geography and genetic diversity.

Data collection and cultivation
Collections of Microcoleus spp. were made during 2019. Once collected, samples were placed directly into plastic bags. Strains were obtained from 75 environmental samples originating from all continents except for South America. Sampling locations had diverse climates and habitats: soil (dry convex surface), puddles (ephemeral concave water bodies 5-10 cm in depth), concrete, moss vegetation and rocks (Fig. 1; Supplementary table S1). A small portion of each sample was placed in 10 ml capped tubes with liquid Zehnder medium (Staub, 1961). Part of the grown biomass was then transferred to Petri dishes on agar-solidified (1.5%) Zehnder medium. Unialgal cultures were isolated following techniques described by Hašler et al. (2012). From each environmental sample, one to 11 cultures, each derived from a single filament, were obtained, and altogether 495 clonal cultures of Microcoleus spp. were established. The entire culture collection is currently maintained at the Department of Botany, Palacký University in Olomouc, Czech Republic. Morphology of the strains was inspected under 400× magnification using a Zeiss Primo Star (Oberkochen, Germany) light microscope and they were identified following the taxonomic system sensu Komárek & Anagnostidis (2005). We evaluated morphological characters of filaments: division, shape, size and the presence of calyptra. All isolates were grown in 10 ml capped tubes with liquid Zehnder medium and maintained at 22 ± 1°C with an average photon flux density of 20 µmol photons m⁻² s⁻¹ under a 12 h light/12 h dark regime.

Phylogenetic analysis
In addition to 495 Microcoleus spp., four Kamptonema animale clonal cultures were obtained from the sample N3 (Norway, Europe) and used as an outgroup in the phylogenetic tree. All 16S rRNA and 16S-23S ITS sequences have been deposited in GenBank (www.ncbi.nlm.nih.gov/genbank; see accession numbers in Supplementary table S1). Multiple sequence alignment of the 16S rRNA gene and 16S-23S ITS sequences and alignment processing were performed with the MUSCLE algorithm (Edgar, 2004) in AliView (Larsson, 2014). The multiple sequence alignment was trimmed using trimAl 1.4.22 (Capella-Gutiérrez et al., 2009) with the option -automated1. A maximum-likelihood phylogenetic tree was inferred in IQ-TREE 1.6.1 (Nguyen et al., 2015). The most appropriate model, TrN+I+G, was selected based on the Bayesian Information Criterion (Schwarz, 1978). The tree topology was tested using ultrafast bootstrapping, implemented within the same software, with 2000 replications (Hoang et al., 2018).
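The model-selection step above can be illustrated with a minimal sketch of how the Bayesian Information Criterion ranks candidate substitution models. The log-likelihoods, parameter counts and alignment length below are invented for illustration and are not values from this study; the function names are hypothetical and this is not how IQ-TREE is invoked.

```python
import math


def bic(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Bayesian Information Criterion: k * ln(n) - 2 * lnL (lower is better)."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


def best_model(fits: dict[str, tuple[float, int]], n_sites: int) -> str:
    """Return the name of the model with the lowest BIC.

    `fits` maps a model name to (log-likelihood, number of free parameters).
    """
    return min(fits, key=lambda name: bic(fits[name][0], fits[name][1], n_sites))


# Hypothetical fits, for illustration only (not results from the paper).
example_fits = {
    "JC": (-5230.0, 1),
    "TrN+I+G": (-5100.0, 7),
    "GTR+I+G": (-5098.0, 10),
}

for name, (log_l, k) in example_fits.items():
    print(f"{name:10s} BIC = {bic(log_l, k, n_sites=1500):.1f}")
print("selected:", best_model(example_fits, n_sites=1500))
```

The penalty term k ln(n) is what lets a simpler model win over a richer one whose likelihood is only marginally higher.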

Ancestral area and state reconstruction
Ancestral state reconstruction (ASR) of geographic origin and habitats of Microcoleus spp. was performed using Bayesian binary MCMC (BBM) analysis implemented in RASP v4.2 (Yu et al., 2015, 2020). The geographic origin of Microcoleus spp. strains was divided into seven regions: Europe, Arctic, Antarctic, Australia, North America, Asia and Africa. The BBM was selected for the ancestral area reconstruction and the estimation of the spatial patterns of geographic diversification within Microcoleus spp. because of its capability of presenting single distribution areas. The BBM was run with the fixed state frequency model (Jukes-Cantor) with equal among-site rate variation for 50 000 generations and 10 chains each. The maximum number of ancestral areas was set to seven for geographic origin and five for ancestral habitats.

Statistical analysis
To assess the effect of the climate on the emergence of Microcoleus spp. strains, we measured the phylogenetic signal. First, 19 bioclimatic variables for the sampling locations were downloaded from the WorldClim v2.1 database (Fick & Hijmans, 2017) at 2.5 arc-minute resolution and then extracted in R (version 4.0.0, R Development Core Team) using the package 'raster' v3.3-13 (Hijmans, 2020) (Supplementary table S2). Then, we measured the phylogenetic signal of the bioclimatic variables using three signal indices. Pagel's λ (Pagel, 1999) was estimated with the function phylosig (package 'phytools' v0.7-47; Revell, 2012), Abouheif's Cmean (Abouheif, 1999) was calculated with the function abouheif.moran and the method oriAbouheif (package 'adephylo' v1.1-11; Jombart et al., 2010) and Moran's I (Moran, 1950) was calculated with the function phyloSignal (package 'phylosignal' v1.3; Keck et al., 2016). A Pagel's λ of zero corresponds to traits that have no phylogenetic signal (the climatic niche evolves independently of the phylogeny), and a value of one corresponds to traits with a strong phylogenetic signal (the evolution of the climatic niche depends on the phylogeny) (Pagel, 1999). To test whether the estimated λ differed significantly from the null hypothesis of no phylogenetic signal (λ = 0), the likelihood-ratio test (−2[logL0 − logL1]) was employed. This test compares two models, where logL0 is the log-likelihood with λ fixed at 0 and logL1 is the log-likelihood at the estimated λ. Its statistical significance was assessed using the χ² test.
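As a minimal sketch of the likelihood-ratio test described above, the statistic −2[logL0 − logL1] can be compared against a χ² distribution. One degree of freedom is assumed here because the two models differ only in λ; the function name and the log-likelihood values are hypothetical and are not results from this study.

```python
import math


def lambda_lrt(log_l0: float, log_l1: float) -> tuple[float, float]:
    """Likelihood-ratio test of a null model against an alternative model.

    Returns (statistic, p_value), where statistic = -2 * (logL0 - logL1)
    and the p-value comes from a chi-square distribution with 1 degree
    of freedom (assumed: the models differ by one parameter).
    """
    statistic = max(-2.0 * (log_l0 - log_l1), 0.0)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods, for illustration only.
stat, p = lambda_lrt(log_l0=-120.0, log_l1=-110.0)
print(f"LR statistic = {stat:.2f}, p = {p:.2e}")
```

Using the closed form for one degree of freedom keeps the sketch dependency-free; a statistics library would be the usual choice for other degrees of freedom.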
Abouheif's Cmean and Moran's I, each with 999 simulations, were also used to detect the phylogenetic signal in the climatic niche space. Both indices range approximately from −1 to 1; values close to zero indicate a lack of phylogenetic signal, whereas positive values indicate that closely related strains have similar trait values. If there was statistically significant positive phylogenetic autocorrelation (Moran's I and Abouheif's Cmean greater than zero), then the null hypothesis of no phylogenetic autocorrelation in the dataset could be rejected.
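The simulation-based test above can be sketched as Moran's I with a one-sided permutation p-value. The proximity matrix below is a made-up stand-in (1 for tips in the same clade, 0 otherwise); how the R packages named above derive proximities from the tree is not reproduced here. The trait values and function names are hypothetical.

```python
import random


def morans_i(values, weights):
    """Moran's I for trait values given a square proximity matrix.

    The diagonal of `weights` is ignored.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    s0 = sum(weights[i][j] for i, j in pairs)
    num = sum(weights[i][j] * dev[i] * dev[j] for i, j in pairs)
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


def permutation_test(values, weights, n_sim=999, seed=0):
    """Shuffle trait values across tips; return (observed I, one-sided p)."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_sim + 1)


# Toy data: two clades of five tips with clearly different trait values.
clade = [0] * 5 + [1] * 5
toy_weights = [[1.0 if clade[i] == clade[j] else 0.0 for j in range(10)]
               for i in range(10)]
toy_trait = [1.0, 1.1, 0.9, 1.2, 0.8, 5.0, 5.1, 4.9, 5.2, 4.8]

i_obs, p_val = permutation_test(toy_trait, toy_weights)
print(f"Moran's I = {i_obs:.3f}, permutation p = {p_val:.3f}")
```

Because the trait separates cleanly by clade in this toy case, the observed index is close to 1 and few shuffles match it.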
The Mantel test (Mantel, 1967) was used to investigate the correlation between climatic, geographic and genetic distance. It was performed in R with the function mantel (package 'vegan' v2.5-6; Oksanen et al., 2016). First, the genetic distance matrix was obtained from MEGA7 (Kumar et al., 2016) using the Tamura-Nei substitution model (Tamura & Nei, 1993). Second, the spatial matrix was built from the geographic coordinates of the sampling localities using the functions distm and distGeo (package 'geosphere' v1.5-10; Hijmans et al., 2017) in R. Lastly, highly correlated bioclimatic variables (|r| > 0.9) were eliminated using the function findCorrelation (package 'caret' v6.0-86; Kuhn, 2008). We kept the following bioclimatic variables: precipitation seasonality, precipitation of driest quarter, mean temperature of warmest quarter, precipitation of coldest quarter, mean diurnal range, isothermality, temperature seasonality, maximum temperature of warmest month, temperature annual range, mean temperature of wettest quarter and mean temperature of the driest quarter. All retained bioclimatic variables were centred and scaled before the environmental matrix was computed with the function dist and the euclidean method in R. The significance of the correlations between the genetic matrix and each of the other two matrices was calculated on the basis of 9999 random permutations with the Pearson correlation coefficient.
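The permutation logic of the Mantel test can be sketched as follows. This is a plain-Python illustration of the general technique, not the vegan implementation; the toy distance matrices and function names are hypothetical.

```python
import math
import random


def upper_triangle(matrix, order):
    """Flatten the upper triangle of `matrix` after reordering rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(dist_a, dist_b, permutations=9999, seed=0):
    """One-sided Mantel test; returns (r, p).

    Rows and columns of dist_a are permuted together, which keeps it a
    valid distance matrix while breaking its link to dist_b.
    """
    identity = list(range(len(dist_a)))
    b_flat = upper_triangle(dist_b, identity)
    observed = pearson(upper_triangle(dist_a, identity), b_flat)
    rng = random.Random(seed)
    order = identity[:]
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        if pearson(upper_triangle(dist_a, order), b_flat) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy example: seven sites on a line, genetic distance proportional to
# geographic distance (invented numbers, for illustration only).
positions = [0.0, 1.0, 3.0, 6.0, 10.0, 15.0, 21.0]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.002 * d for d in row] for row in geo]

r, p = mantel(geo, gen, permutations=999, seed=1)
print(f"Mantel r = {r:.3f}, p = {p:.4f}")
```

Permuting whole rows and columns, rather than individual cells, is the step that respects the non-independence of pairwise distances.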

The diversity within Microcoleus spp.
Phylogenetic analysis of 16S rRNA and 16S-23S ITS sequences revealed 13 monophyletic lineages of Microcoleus spp. (Figs 2, 3, Supplementary fig. S1). Some lineages were more diverse than others in our collection. Lineage 1 was the most diverse one containing 149 strains, followed by lineages 6, 10 and 11, which all had ≥ 50 strains. Lineages 5, 7, 9, 12 and 13 were least diverse containing <10 strains and in the rest of lineages, the number of strains ranged from 10 to 50 (details in Supplementary table S3).
We detected different lineages of Microcoleus spp. coexisting within a sampling site (Supplementary table S4). In 45 samples we found just one lineage, in 21 samples we found two lineages and in eight samples we discovered three. In one sample from Europe (AT16), we found four lineages.

Phylogeography and evolution of habitat preference in Microcoleus spp.
The phylogeny revealed that isolates mostly clustered according to their geographic origin within the respective lineages (Fig. 2). Within lineage 1, we observed a highly supported subclade (bootstrap values ≥ 99) composed of African and European strains. The exceptions to the clustering pattern were two North American strains that grouped with European isolates. Lineage 2 included well-supported North American and European subclades. Lineage 3 included a highly supported and diverse subclade of Arctic strains, a European subclade and a subclade composed of strains from Europe and Asia. Within lineage 4, a highly supported subclade encompassing North American strains was found among two European subclades. Six lineages (5, 7, 9, 10, 12, 13) were composed of only European strains. Lineage 6 comprised several European, North American and Asian subclades. One African isolate had the longest branch in the tree and it formed a highly supported subclade with European strains. Moreover, two isolates from Asia and North America clustered with European strains, contrary to the clustering pattern according to geographic origin. Lineage 8 had well-defined subclades with high bootstrap support that included North American and European isolates. Yet, three North American strains appeared to be more closely related to the group of European ones. Within lineage 11 there were very well-supported subclades with Antarctic and European strains that were closely related to the Australian strains. Four strains forming three singleton nodes were not included in any of the thirteen lineages, but they were counted as an additional lineage isolated from the respective samples.
Isolation by distance was tested using a correlation analysis of genetic and geographical distance with the Mantel test. Geographical and genetic distances between the strains were significantly correlated (r = 0.2971, p = 0.0001) (Fig. 4). This provides additional evidence that geographic distance plays an important role in the evolution of Microcoleus spp. lineages.
An ancestral reconstruction of Microcoleus habitats indicates that the ancestor of the lineages most likely originated in soils (Supplementary fig. S3). The clustering of Microcoleus spp. strains mostly followed their habitat preference within the respective lineages as well (Fig. 3). Within lineage 1 were several well-defined subclades including soil and puddle strains. However, a few exceptions to the observed pattern were strains from soil and puddles that clustered together. Moreover, a strain isolated from concrete grouped with strains isolated from puddles and soil. Lineage 2 included two subclades with soil isolates and one with isolates from puddle and concrete. All strains from lineages 3, 5, 7 and 9 originated from soil, whereas lineage 13 included strains only from puddles. A single puddle isolate was found among strains isolated from soil within lineage 4. Lineage 6 contained several subclades from soils and a subclade isolated from a puddle. Moreover, four puddle strains clustered with strains from soil and concrete habitats. Lineages 8, 10 and 12 contained strains from soils and puddles. Lineage 11 encompassed strains isolated from all habitats. Isolates from soil, puddles and moss vegetation formed highly supported and diversified subclades, whereas a strain isolated from concrete and two from puddles did not appear to have followed the pattern of clustering according to habitat preference.

Phylogenetic signal
We tested the presence of phylogenetic signal in climatic niche space of Microcoleus spp. to investigate whether climatic preferences (temperature and precipitation) are well predicted by the phylogenetic relatedness. We observed a strong phylogenetic signal in bioclimatic variables after the removal of autocorrelated variables (Supplementary table S5). According to Pagel's λ, Moran's I, and Abouheif's Cmean measurements, 11 bioclimatic variables exhibited a statistically significant (p ≤ 0.001) phylogenetic signal (Supplementary table S5).
A Mantel test was employed to examine the effect of the climate on the Microcoleus spp. diversity. The distance between contemporary climatic conditions calculated with the euclidean method and genetic distance showed a significant correlation (r = 0.2941, p = 0.0001). This relationship stresses the importance of the climate on the divergence of Microcoleus spp. lineages.

Discussion
Detecting microbial biogeographic patterns and discovering the mechanisms driving them are crucial for understanding the microbial diversity, distribution and evolution (Hanson et al., 2012). In this study, we examined the congruence between phylogenetic relationships, geography and the environment (climate and habitat) among Microcoleus spp. populations using 16S rRNA and 16S-23S ITS genes. We found that there is considerable genetic diversity within Microcoleus spp. clusters at both local and global scales. In addition, we note a significant influence of geographic and environmental heterogeneity in the distribution and diversification of Microcoleus spp. lineages.
Microcoleus is one of the dominant biological soil crust cyanobacteria (e.g. Boyer et al., 2002; Gundlapally & Garcia-Pichel, 2006; Strunecký et al., 2013). Of the thirteen lineages within Microcoleus spp. that we observed, we suspect that some might represent novel cyanobacterial species (Supplementary fig. S1). We noted high diversity among identified lineages. For instance, lineage 1 had 149 strains, which suggests either that it is more abundant than other lineages in the investigated environments or that monoclonal cultures of it are easier to isolate (Supplementary table S3).
In general, the evolution of different prokaryotic lineages from an initial population is possible due to the high adaptability of these organisms, promiscuous gene exchange, reduced dispersal and restricted gene flow (Whitaker et al., 2003; Cadillo-Quiroz et al., 2012; Dvořák et al., 2015). In most studies, a single sampling site is usually represented by a single isolate (hence, one sequence), so only a single lineage of Microcoleus spp. could be detected per site. Population-level sampling is necessary to discover multiple lineages at a site. Employing whole-genome sequencing, for example, Chase et al. (2017) and Hunt et al. (2008) observed such a pattern in soil Curtobacterium and the marine bacterium Vibrio, respectively. Studies investigating that pattern are still rare in cyanobacteria. Nevertheless, Pietrasiak et al. (2014) isolated two different species of the cyanobacterium Symplocastrum from the same soil crust locality using 16S rRNA and 16S-23S ITS. Our study is consistent with the aforementioned findings and here we note the coexistence of several Microcoleus spp. lineages at the same site (Supplementary table S4). Sixty per cent of the samples (45 of 75) contained only one Microcoleus spp. lineage, while the rest had two, three or four. Additionally, our data illustrate the co-occurrence of distantly related lineages from different populations within the same locality, whilst more closely related ones were far apart. As evidenced in Prochlorococcus by Kashtan et al. (2014), the coexistence of genetically distinct lineages could represent one of the characteristic traits of free-living prokaryotes.
Phylogenetic inference and ancestral area reconstruction analysis revealed the phylogeographic structure within the respective lineages of Microcoleus spp. (Fig. 2). An ancestral reconstruction of Microcoleus spp. geographic origin revealed that ancestors of our isolates most likely originated in Europe (Supplementary fig. S2). However, this finding could be an artefact of the over-representation of European strains in our collection. Although isolates mostly clustered by geographic origin within their respective lineages, a few strains did not follow this pattern (Fig. 2). Such a relationship between isolates suggests potential gene flow among strains from Europe and North America on one side, and Europe, Asia and Africa on the other. The Mantel test results showed that geographic isolation can affect speciation in Microcoleus spp. (Fig. 4); as physical distance becomes smaller, strains of Microcoleus tend to be more genetically similar. Therefore, our results do not agree with Baas-Becking's (1934) tenet that everything is everywhere, nor with the studies of Fenchel (2003) and Finlay (2002), because isolation by distance influences the diversification of Microcoleus spp. A similar pattern was documented in the archaeon Sulfolobus islandicus (Reno et al., 2009). Thus, allopatry, i.e. speciation of geographically isolated populations, represents an important contributing factor in the speciation of Microcoleus, yet it may be of a temporary duration (Bahl et al., 2011; Dvořák et al., 2012). Some of the possible dispersal pathways of filamentous cyanobacteria between continents include the atmosphere (e.g. Sharma & Singh, 2010), animals (e.g. Moore, 1985), and human factors (reviewed in Curren & Leong, 2020). Nevertheless, we are unable to reconstruct the intensity of the gene flow between lineages in this study.
The climate has a significant effect on the diversification, distribution and composition of cyanobacterial assemblages in biological soil crusts (e.g. Büdel et al., 2009; Bahl et al., 2011; Garcia-Pichel et al., 2013). Among the environmental variables that could explain the genetic diversity of cyanobacteria are temperature, precipitation, soil composition and crust type. Ribeiro et al. (2020) demonstrated that contemporary and past climatic conditions significantly affect the genetic diversity of global Microcystis aeruginosa and Raphidiopsis raciborskii populations. Moreover, Microcoleus-dominated cyanobacterial communities have been shown to be affected by temperature and precipitation (Muñoz-Martin et al., 2019). Our study is consistent with these findings, as bioclimatic variables correlated significantly with the global genetic diversity of Microcoleus spp. Thus, the global population structure of Microcoleus spp. is also affected by isolation by environment.
Whilst phylogenetic signal is commonly investigated in macroorganisms (see Losos, 2008) and algae (e.g. Škaloud & Rindi, 2013; Narwani et al., 2015), it remains understudied in prokaryotes. A strong phylogenetic signal was recently detected in some morphological, ecological and physiological traits in cyanobacteria (Uyeda et al., 2016). Three independent phylogenetic signal measurements in this study supported the existence of a high phylogenetic signal in some ecological traits (Supplementary table S5). This trend suggests non-independent evolution between climate and phylogeny (i.e. closely related Microcoleus spp. strains tended to be more similar in their temperature and precipitation preferences than the more distantly related ones).
Additionally, we show that the strains also cluster by habitat within the respective lineages (Fig. 3). Microcoleus was initially described from soils (Gomont, 1892). Recently, strains of Microcoleus have been found in other habitats as well: freshwater epipelon, puddles, rocks and concrete (Hašler et al., 2012; this study). The ASR of habitats revealed that the majority of our isolates originated from the soil (Supplementary fig. S3). However, the oversampling of soil in this study may affect the observed pattern.
Elevated temperature, shifts in precipitation frequencies and anthropogenic influences are altering the structure of cyanobacterial communities, their abundances, growth rates and distribution (Flombaum et al., 2013; Steven et al., 2015; Fernandes et al., 2018). Our results suggest that diversification of Microcoleus spp. lineages might be driven by temperature and precipitation. Thus, climate change may alter the observed patterns of Microcoleus spp. diversity.
Our study showed biogeographic patterns of Microcoleus spp. populations and the relationship between geography, environment and genetic diversity. Isolation by distance and isolation by environment affected the distribution and the diversification of Microcoleus spp. strains. Consequently, at least thirteen distinct lineages were found that could be new cyanobacterial species with very similar morphologies. As climate is a driver of the evolution in Microcoleus spp., climate change may affect its distribution and diversity.

Funding
This research was funded by the Grant Agency of the Czech Republic (GAČR) with the grant 19-12994Y as well as by Internal Grant Agency (IGA) with the grant PrF-2022-002.