# Morphology-based neighbour-net of seed plants: quick exploratory data analysis of the matrix of Rothwell & Stockey (2016)

These two figures illustrate the problem of using consensus trees, whether based on a sample of most-parsimonious trees or on Bayesian-inferred posterior probabilities, to establish phylogenetic relationships of extant and extinct seed plants.

**Figure 1: Neighbour-net, a planar splits graph, based on a matrix of mean morphological distances inferred from the data matrix generated by Rothwell & Stockey,** *Am. J. Bot.*, 2016 and used by Coiro et al., *bioRxiv*, 2017. The primary signal from the matrix is largely incompatible, as reflected by the pronounced box-like parts of the graph. The matrix's Delta value, a measure of treelikeness (Holland et al., *Mol. Biol. Evol.*, 2002), is 0.322 (matrices whose signal allows relatively well-resolved phylogenetic trees to be inferred usually have Delta values of < 0.15). The miscellaneous signal provided by the matrix nevertheless supports to some degree the currently preferred 'Gnetifer' hypothesis, a sister relationship of Gnetales (Gnetidae) – a group of three long-branching enigmatic genera, with a much larger diversity in the past, long considered to be intermediate between gymnosperms (s.str.) and angiosperms – and the conifers. The recently described Petriellales are a potential sister group of the angiosperms (but note the long distal edge bundles separating the two groups, and the short proximal edges they share). Otherwise the matrix supports the same groups and has similar resolution issues as earlier matrices; group labels follow the terms introduced by Hilton & Bateman, *J. Torrey Bot. Soc.*, 2006. The graph depicts the overall differentiation patterns encoded in the matrix as well as the (putative) derivedness of each taxon or taxon group, well reflected by the circular arrangement of the taxa in relation to the assumed (primitive) outgroup, the progymnosperms.
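To make the Delta value cited in the Figure 1 caption more concrete, the following is a minimal sketch of how such a score can be computed from a character matrix. It rests on two assumptions not stated in the source: that "mean morphological distance" means the proportion of differing states among the characters scored in both taxa, and that the matrix-wide Delta is the mean of the per-quartet values of Holland et al. (2002). The function names and the toy matrix are hypothetical; polymorphic or ordered characters are not handled.

```python
from itertools import combinations


def mean_character_distance(a, b, missing="?"):
    """Proportion of differing states among characters scored in both taxa."""
    shared = [(x, y) for x, y in zip(a, b) if x != missing and y != missing]
    if not shared:
        raise ValueError("taxa share no scored characters")
    return sum(x != y for x, y in shared) / len(shared)


def distance_matrix(matrix, missing="?"):
    """Pairwise distances as a nested dict: dist[taxon1][taxon2]."""
    taxa = sorted(matrix)
    dist = {t: {t: 0.0} for t in taxa}
    for s, t in combinations(taxa, 2):
        d = mean_character_distance(matrix[s], matrix[t], missing)
        dist[s][t] = dist[t][s] = d
    return dist


def quartet_delta(dist, w, x, y, z):
    """Per-quartet delta: 0 for a perfectly tree-like quartet, up to 1 for a box."""
    # The three ways of pairing four taxa, each scored by its summed distances.
    smallest, middle, largest = sorted([
        dist[w][x] + dist[y][z],
        dist[w][y] + dist[x][z],
        dist[w][z] + dist[x][y],
    ])
    if largest == smallest:
        return 0.0  # star-like quartet: no conflicting signal to measure
    return (largest - middle) / (largest - smallest)


def mean_delta(dist):
    """Average of the quartet deltas over all quartets of taxa (assumed summary)."""
    quartets = list(combinations(sorted(dist), 4))
    if not quartets:
        raise ValueError("need at least four taxa")
    return sum(quartet_delta(dist, *q) for q in quartets) / len(quartets)


# Hypothetical toy matrix: five taxa, five binary characters, one missing cell.
toy = {"A": "00000", "B": "00011", "C": "11000", "D": "1110?", "E": "01101"}
print(round(mean_delta(distance_matrix(toy)), 3))
```

On an additive (perfectly tree-like) distance matrix the two largest sums are equal for every quartet, so the score is 0; conflicting signal pushes the middle sum towards the smallest and the score towards 1.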

**Figure 2: Ambiguous support for alternative phylogenetic splits** using four support measures: non-parametric bootstrapping under three optimality criteria: maximum likelihood (1000 pseudoreplicates generated with Lewis' (*Syst. Biol.*, 2001) one-parameter ('Markov') model implemented in RAxML), maximum parsimony, and least-squares estimated via the neighbour-joining algorithm (10,000 pseudoreplicates generated with PAUP*); and posterior probabilities (using MrBayes 3.2 and Lewis' model; Coiro et al. used the same programme but a general-time reversible model with potentially 9 x 9 substitution categories). Note the often low to diminishing support of phylogenetic splits seen in the originally published strict-consensus tree of a set of most-parsimonious trees (green) or their alternatives (red). Black brackets/numbers refer to relationships that are not in conflict with the originally published tree (i.e. relate to polytomies in the strict consensus tree). The most important observation is that some splits with PP > 0.5, i.e. one alternative being more probable than all others together, have equally high support using bootstrapping (under ML and/or LS/NJ) but others do not. A number of splits conflicting with the topology of the original tree have relatively high support, independent of the method used. Rothwell & Stockey's crown clade including Doyleales, Gnetidae, Petriellales, and angiosperms is rejected by the differential support patterns; instead Doyleales are supported as conifers/relatives of extant conifers, as found by Coiro et al. (2017) using Bayesian inference. On the other hand, Coiro et al.'s placement of *Caytonia* as sister of Petriellales and angiosperms (sister to corystosperms in Rothwell & Stockey's strict consensus tree) finds little support from the data and may be an over-parametrisation artefact.

Extant (living) taxa are highlighted by blue font in both figures. Abbreviations used in both figures: D.m., D.t. = Doyleales (see Rothwell & Stockey, 2006); E. = *Emporia*, an extinct, ancient conifer/conifer-relative representing the order Voltziales in this matrix; Cx., Mx. = *Cordaixylon*, *Mesoxylon*, two representatives of the extinct order Cordaitales; G. = *Ginkgo*.
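The non-parametric bootstrapping named in the Figure 2 caption resamples the characters of the matrix with replacement and asks how often a given split is recovered. The sketch below illustrates only that resampling logic. It does not reproduce the RAxML, PAUP* or MrBayes analyses: as a stand-in for tree inference it picks, for a single quartet of taxa, the pairing with the smallest summed Hamming distance. All function names and the example data are hypothetical, and the matrix is assumed to contain no missing data.

```python
import random


def bootstrap_columns(matrix, rng):
    """One pseudoreplicate: characters (columns) drawn with replacement."""
    n_chars = len(next(iter(matrix.values())))
    cols = [rng.randrange(n_chars) for _ in range(n_chars)]
    return {taxon: [states[c] for c in cols] for taxon, states in matrix.items()}


def hamming(a, b):
    """Number of characters at which two taxa differ."""
    return sum(x != y for x, y in zip(a, b))


def best_split(matrix, quartet):
    """Toy criterion: the pairing with the smallest summed distance, or None on ties."""
    a, b, c, d = quartet
    sums = {
        f"{a}+{b} | {c}+{d}": hamming(matrix[a], matrix[b]) + hamming(matrix[c], matrix[d]),
        f"{a}+{c} | {b}+{d}": hamming(matrix[a], matrix[c]) + hamming(matrix[b], matrix[d]),
        f"{a}+{d} | {b}+{c}": hamming(matrix[a], matrix[d]) + hamming(matrix[b], matrix[c]),
    }
    smallest = min(sums.values())
    winners = [split for split, total in sums.items() if total == smallest]
    return winners[0] if len(winners) == 1 else None


def split_support(matrix, quartet, n_reps=1000, seed=0):
    """Fraction of pseudoreplicates in which each split is the unique winner."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_reps):
        winner = best_split(bootstrap_columns(matrix, rng), quartet)
        if winner is not None:
            counts[winner] = counts.get(winner, 0) + 1
    return {split: n / n_reps for split, n in counts.items()}


# Hypothetical data: six binary characters scored for four taxa.
example = {"A": "000110", "B": "001110", "C": "110001", "D": "111001"}
print(split_support(example, ("A", "B", "C", "D"), n_reps=200))
```

Because tied pseudoreplicates count for no split, the reported fractions can sum to less than 1, which is one simple way for conflicting or weak signal to show up as low support for every alternative.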