Supplementary Information from Resurrecting ancestral genes in bacteria to interpret ancient biosignatures

Two datasets, the geologic record and the genetic content of extant organisms, provide complementary insights into the history of how key molecular components have shaped or driven global environmental and macroevolutionary trends. Changes in global physico-chemical modes over time are thought to be a consistent feature of this relationship between Earth and life, because life has presumably been optimizing protein functions for the entirety of its approximately 3.8-billion-year history on Earth. Organismal survival depends on how well critical genetic and metabolic components can adapt to their environments, reflecting an ability to optimize efficiently in response to changing conditions. The geologic record provides an array of biologically independent indicators of macroscale atmospheric and oceanic composition, but reveals little about the exact behaviour of the molecular components that influenced the compositions of these reservoirs. By reconstructing sequences of proteins that might have been present in ancient organisms, we can narrow the field to a subset of possible sequences that may have been optimized for these ancient environmental conditions. How can one use modern life to reconstruct ancestral behaviours? Configurations of ancient sequences can be inferred from the diversity of extant sequences and then resurrected in the laboratory to ascertain their biochemical attributes. One way to augment sequence-based, single-gene methods and obtain a richer and more reliable picture of the deep past is to resurrect inferred ancestral protein sequences in living organisms, where their phenotypes can be exposed in a complex molecular-systems context, and then to link the consequences of those phenotypes to biosignatures preserved in the independent historical repository of the geological record.
As a first step beyond single-molecule reconstruction and toward the study of functional molecular systems, we present here the ancestral sequence reconstruction of the beta-carbonic anhydrase protein. We assess how carbonic anhydrase proteins meet our selection criteria for reconstructing ancient biosignatures in the laboratory, an approach we term paleophenotype reconstruction.