This dataset contains a gene cluster (GC) sequence database in MMseqs2 format and a tab-separated table describing the 60,515 lineage-specific GCs (54,343 of unknown function and 6,172 of known function) within the phylum Cand. Patescibacteria, more commonly known as the Candidate Phyla Radiation (CPR).
The GCs are part of agnostosDB (dbf02445-20200519), which is available at https://doi.org/10.6084/m9.figshare.12459056 and described in Vanni et al. 2020.
Visit the website https://dark.metagenomics.eu/ for more detailed information about the analyses.
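
As a first look at the tab-separated table, a minimal Python sketch along the following lines may help. It assumes the table has a header row, which the description does not state. The function name `summarize_tsv`, the file name and the column name in the usage comment are hypothetical placeholders, so check the actual header of the downloaded file before relying on them.

```python
import csv
from collections import Counter


def summarize_tsv(path, category_column=None):
    """Summarize a tab-separated table that is assumed to have a header row.

    Returns (column_names, number_of_data_rows, value_counts), where
    value_counts tallies the values of `category_column` if one is given
    and is empty otherwise.
    """
    counts = Counter()
    n_rows = 0
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            n_rows += 1
            if category_column is not None:
                counts[row[category_column]] += 1
        # fieldnames holds the header row, or None for an empty file
        header = reader.fieldnames
    return header, n_rows, counts


# Hypothetical usage; both the file name and the column name are placeholders:
# header, n_rows, counts = summarize_tsv("cpr_lineage_specific_gcs.tsv", "category")
# print(header, n_rows, counts)
```

Only the standard library is used, so the sketch runs without extra dependencies. If the table is as described, the reported row count should match the 60,515 GCs.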