pbio.3002397.s001.docx (1.29 MB)

S1 Text

journal contribution
posted on 2023-12-05, 18:32 authored by Sourabh Palande, Joshua A. M. Kaste, Miles D. Roberts, Kenia Segura Abá, Carly Claucherty, Jamell Dacon, Rei Doko, Thilani B. Jayakody, Hannah R. Jeffery, Nathan Kelly, Andriana Manousidaki, Hannah M. Parks, Emily M. Roggenkamp, Ally M. Schumacher, Jiaxin Yang, Sarah Percival, Jeremy Pardo, Aman Y. Husbands, Arjun Krishnan, Beronda L Montgomery, Elizabeth Munch, Addie M. Thompson, Alejandra Rougon-Cardoso, Daniel H. Chitwood, Robert VanBuren

Fig A. Histogram of 3-way factors of the RNAseq samples before and after downsampling. The distribution of 3-way factors for family, tissue, and stress is plotted. The 16 families, 8 tissue types, and 10 stresses equate to 1,280 unique 3-way combinations, but we only observed 195 unique combinations in our dataset. The distribution of samples from the entire dataset is shown on the left, and the distribution of samples when downsampling the 30 most common 3-way combinations is shown on the right. Raw expression data underlying the graphs in this figure can be found in S7 Dataset, and code can be found in https://zenodo.org/records/8428609 [65].

Fig B. Factor-wise frequency plots of RNAseq samples before and after subsampling. The number of samples in each family, tissue type, or stress is plotted before (top) and after (bottom) subsampling. Raw expression data underlying the graphs in this figure can be found in S7 Dataset, and code can be found in https://zenodo.org/records/8428609 [65].

Fig C. Topology of Mapper graphs generated from the subsampled data. Samples from each node in the Mapper graph are colored by plant family (A), stress (B), or tissue type (C), using the subsampled data. The overall topology and sample distribution are similar to the Mapper graphs constructed with the full, unbalanced dataset, suggesting that sample distribution is not a major factor in our analyses.

Fig D. Linear regression analysis of association of surrogate variables to one batch variable (BioProject), our biological variables of interest (stress, tissue, and family), and their pairwise interactions. All surrogate variables were regressed on either each variable or interaction individually to calculate adjusted R² values.

Table A. Enrichment of GreenCut2 genes in orthogroup-mapped Arabidopsis thaliana genes and stress-/tissue-correlated orthogroup-mapped genes.
The proportion of GreenCut2 genes in all the orthogroups used in this study was compared against the proportion of GreenCut2 genes in a list of all A. thaliana genes using a one-sided binomial test. The proportion of tissue lens and stress lens correlated orthogroup-mapped genes in GreenCut2 was compared against the proportion of GreenCut2 genes in the entire set of orthogroup-mapped genes using one-sided binomial tests. Tissue-correlated genes were hypothesized to be more likely to be in GreenCut2 than a random selection of orthogroup-mapped genes, and the stress-correlated genes were hypothesized to be less likely.
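
The one-sided binomial comparisons described for Table A can be illustrated with a minimal sketch. This is not the authors' code (their code is at the Zenodo record cited above). The function names, the background proportion, and all gene counts below are hypothetical placeholders chosen for illustration only. The sketch computes exact tail probabilities with the Python standard library; for large gene lists, `scipy.stats.binomtest` with `alternative="greater"` or `alternative="less"` would be the more practical choice.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def binom_lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, k + 1))


def one_sided_binomial_test(k, n, p, alternative):
    """Exact one-sided binomial p-value.

    k: observed number of "successes" (e.g. GreenCut2 genes in a gene list)
    n: size of the gene list
    p: background proportion under the null hypothesis
    alternative: "greater" (enrichment) or "less" (depletion)
    """
    if alternative == "greater":
        return binom_upper_tail(k, n, p)
    if alternative == "less":
        return binom_lower_tail(k, n, p)
    raise ValueError("alternative must be 'greater' or 'less'")


if __name__ == "__main__":
    # HYPOTHETICAL numbers, not taken from the study.
    background = 0.05  # assumed GreenCut2 fraction among orthogroup-mapped genes

    # Tissue-correlated genes: hypothesized to be enriched -> upper tail.
    p_tissue = one_sided_binomial_test(k=20, n=200, p=background, alternative="greater")

    # Stress-correlated genes: hypothesized to be depleted -> lower tail.
    p_stress = one_sided_binomial_test(k=3, n=200, p=background, alternative="less")

    print(f"tissue (enrichment) p-value: {p_tissue:.4g}")
    print(f"stress (depletion) p-value: {p_stress:.4g}")
```

The direction of each test follows the stated hypothesis: the upper tail asks whether GreenCut2 genes are over-represented in a list, and the lower tail asks whether they are under-represented.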

(DOCX)
