posted on 2022-06-02, 17:37 authored by Bardya Djahanschiri, Gisela Di Venanzio, Jesus S. Distel, Jennifer Breisch, Marius Alfred Dieckmann, Alexander Goesmann, Beate Averhoff, Stephan Göttig, Gottfried Wilharm, Mario F. Feldman, Ingo Ebersberger

This file gives detailed information about the ESGC identification and provides the underlying data. Sheet 1 gives, for each gene along the genome of A. baumannii ATCC 19606, the number of taxa per clade (and per international clone type) that harbor an ortholog. Sheet 2 gives, for each gene, the input vector used for the dissimilarity calculations, the individual threshold (5th percentile) and, as an example, the calculated dissimilarity between that gene and the gene immediately upstream. The full matrix containing all pairwise dissimilarity calculations used for the prediction of clusters is deposited in txt format on figshare. Sheet 3 lists the identified graph components along with abundance statistics (median over the proteins in a component) across the clades (both absolute and relative), the retention difference (RD) between ACB and non-ACB, and CCD scores. Sheet 4 lists detailed contextual information for the components, including all functional annotations from various sources. Detailed explanations of the column headers for all tables are given in the sheet ‘ColumnLegends’.