Tetep-1.1_pseudomolecule.fasta.gz: Tetep genome sequences (chromosome pseudomolecules) in fasta format, gzip compressed, corresponding to the DDBJ/ENA/GenBank accession GCA_004348155.2 or QQAJ00000000.
Tetep-1.1_pseudomolecule.softmasked.fasta.gz: Repeat masked (lowercased) sequences in fasta format, gzip compressed.
Tetep-1.1b_pseudomolecule.fasta.gz: A modified version of the assembly in which contig tig00011498 is manually anchored to chr11:21344774-21363909. This placement is yet to be confirmed.
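
The genome files can be read directly without decompressing them first. The following is a minimal sketch, using only the Python standard library, that reports the length of each sequence and the fraction of lowercase (soft-masked) bases. The function names `read_fasta` and `softmasked_fraction` are illustrative and not part of this data release.

```python
import gzip


def read_fasta(path):
    """Yield (name, sequence) pairs from a plain or gzip-compressed FASTA file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    name, chunks = None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                # Keep only the first whitespace-delimited token as the record name.
                name = (line[1:].split() or [""])[0]
                chunks = []
            elif line:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def softmasked_fraction(sequence):
    """Return the fraction of bases written in lowercase (soft-masked repeats)."""
    if not sequence:
        return 0.0
    return sum(base.islower() for base in sequence) / len(sequence)
```

For example, iterating over `read_fasta("Tetep-1.1_pseudomolecule.softmasked.fasta.gz")` and printing `name`, `len(seq)` and `softmasked_fraction(seq)` gives a per-chromosome summary of the masked content.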
## Predicted gene sets
Tetep-1.1_fgenesh.tar.gz: Gene set predicted by Fgenesh.
Tetep-1.1b_maker.tar.gz: Gene set predicted by MAKER-P (note that the MAKER annotation is based on genome version 1.1b).
## NLR sequences predicted for 4 rice cultivars and Brachypodium distachyon
Files are given in fasta format, including full coding sequences (nucleotide), full protein sequences, nucleotide sequences of NB-ARC domain regions, and protein sequences of NB-ARC domain regions. All NB-ARC type genes are provided, and the NB-LRR type genes are marked by "NLR".
Tetep.NLRs.tar.gz: NLR genes predicted in the Tetep genome.
MH63.NLRs.tar.gz: NLR genes predicted in the Minghui63 genome (genome and annotation [MH63RS1]: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/001/623/365/GCA_001623365.1_MH63RS1/).
R498.NLRs.tar.gz: NLR genes predicted in the R498 genome (genome and annotation [V3]: http://www.mbkbase.org/R498/).
Bdistachyon.NLRs.tar.gz: NLR genes predicted in Brachypodium distachyon (genome [v3.0] and annotation [v3.1]: https://phytozome.jgi.doe.gov/pz/portal.html#!info?alias=Org_Bdistachyon).
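
To separate the NB-LRR type genes from the remaining NB-ARC type genes, the FASTA headers can be filtered on the "NLR" mark. The sketch below assumes that the mark appears as the literal text `NLR` somewhere in the header line; this is an assumption about the file layout and should be checked against the extracted files. The function name `split_by_nlr_mark` is hypothetical.

```python
def split_by_nlr_mark(fasta_lines, mark="NLR"):
    """Split FASTA headers into those that carry the mark and those that do not.

    Assumption: the mark is plain text inside the header line.
    A header that contains the same text for another reason would also match.
    """
    marked, unmarked = [], []
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].strip()
            (marked if mark in header else unmarked).append(header)
    return marked, unmarked
```

After extracting a tarball, passing an open file handle for one of its FASTA files to `split_by_nlr_mark` returns the two header lists.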
## Files for phylogenetic analysis
### Figure_4c
This tarball contains the alignments and phylogenetic trees constructed for "Figure 4c. Paired NLRs identified in the Nipponbare genome that are Pi-ta or Pi-kh homologs".
Figure_4c.Pita_Pikh_pairs.full_cds.clustalw2.fas: ClustalW2 aligned coding sequences (codon-based alignments). Fasta format.
Figure_4c.Pita_Pikh_pairs.full_cds.FastTree.nwk: Phylogenetic tree constructed using FastTree with full coding sequences. Newick format.
Figure_4c.Pita_Pikh_pairs.NB-ARC.cds.clustalw2.fas: ClustalW2 aligned nucleotide sequences of NB-ARC domain (codon-based alignments). Fasta format.
Figure_4c.Pita_Pikh_pairs.NB-ARC_cds.FastTree.nwk: Phylogenetic tree constructed using FastTree with nucleotide sequences of NB-ARC domain alone. Newick format.
Figure_4c.Pita_Pikh_pairs.NB-ARC_pep.clustalw2.fas: ClustalW2 aligned protein sequences of NB-ARC domain. Fasta format.
Figure_4c.Pita_Pikh_pairs.NB-ARC_pep.RAxML.nwk: Phylogenetic tree constructed using RAxML with protein sequences of NB-ARC domain alone. Newick format.
### Figure_6
This tarball contains the alignments and phylogenetic trees constructed for "Figure 6. Phylogenetic tree of paired NLRs".
Figure_6.NLR_pairs.full_cds.clustalw2.fas: ClustalW2 aligned coding sequences (codon-based alignments). Fasta format.
Figure_6.NLR_pairs.full_cds.FastTree.nwk: Phylogenetic tree constructed using FastTree with full coding sequences. Newick format.
Figure_6.NLR_pairs.NB-ARC_cds.clustalw2.fas: ClustalW2 aligned nucleotide sequences of NB-ARC domain (codon-based alignments). Fasta format.
Figure_6.NLR_pairs.NB-ARC_cds.FastTree.nwk: Phylogenetic tree constructed using FastTree with nucleotide sequences of NB-ARC domain alone. Newick format.
Figure_6.NLR_pairs.NB-ARC_pep.clustalw2.fas: ClustalW2 aligned protein sequences of NB-ARC domain. Fasta format.
Figure_6.NLR_pairs.NB-ARC_pep.RAxML.nwk: Phylogenetic tree constructed using RAxML with protein sequences of NB-ARC domain alone. Newick format.
### Supplementary_Figure_2
This tarball contains the alignments and phylogenetic trees constructed for "Supplementary Figure 2. Phylogenetic tree of all NLRs identified in Tetep (*.fgenesh*), Nipponbare (LOC_Os*) and B. distachyon (Brad*)".
Supplementary_Figure_2.NLR_phylogeny.full_cds.clustalw2.fas: ClustalW2 aligned coding sequences (codon-based alignments). Fasta format.
Supplementary_Figure_2.NLR_phylogeny.full_cds.FastTree.nwk: Phylogenetic tree constructed using FastTree with full coding sequences. Newick format.
Supplementary_Figure_2.NLR_phylogeny.NB-ARC_cds.clustalw2.fas: ClustalW2 aligned nucleotide sequences of NB-ARC domain (codon-based alignments). Fasta format.
Supplementary_Figure_2.NLR_phylogeny.NB-ARC_cds.FastTree.nwk: Phylogenetic tree constructed using FastTree with nucleotide sequences of NB-ARC domain alone. Newick format.
Supplementary_Figure_2.NLR_phylogeny.NB-ARC_cds.RAxML.nwk: Phylogenetic tree constructed using RAxML with nucleotide sequences of NB-ARC domain alone. Newick format.
Supplementary_Figure_2.NLR_phylogeny.NB-ARC_pep.clustalw2.fas: ClustalW2 aligned protein sequences of NB-ARC domain. Fasta format.
Supplementary_Figure_2.NLR_phylogeny.NB-ARC_pep.RAxML.nwk: Phylogenetic tree constructed using RAxML with protein sequences of NB-ARC domain alone. Newick format.
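
The Newick files can be inspected without a phylogenetics library. The sketch below extracts leaf names with a regular expression and assigns each leaf to its source genome using the naming patterns given for Supplementary Figure 2 (`fgenesh` in the name for Tetep, `LOC_Os` prefix for Nipponbare, `Brad` prefix for B. distachyon). It assumes unquoted leaf labels without spaces or Newick comments, which may not hold for every tree file; the function names are illustrative.

```python
import re
from collections import Counter

# A leaf label follows "(" or "," and runs up to the next Newick delimiter.
# Internal node labels and support values follow ")" and are therefore skipped.
LEAF_PATTERN = re.compile(r"[(,]\s*([^\s(),:;]+)")


def leaf_names(newick_text):
    """Return the leaf labels of a Newick tree, in order of appearance."""
    return LEAF_PATTERN.findall(newick_text)


def source_genome(leaf):
    """Guess the source genome of a leaf from its gene identifier."""
    if "fgenesh" in leaf:
        return "Tetep"
    if leaf.startswith("LOC_Os"):
        return "Nipponbare"
    if leaf.startswith("Brad"):
        return "B. distachyon"
    return "unknown"


def count_leaves_by_genome(newick_text):
    """Count the leaves of a Newick tree per source genome."""
    return Counter(source_genome(leaf) for leaf in leaf_names(newick_text))
```

Reading a `.nwk` file into a string and passing it to `count_leaves_by_genome` gives a quick check that the tree contains the expected number of genes per genome.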
## Miscellaneous
Tetep.NLR.Sanger.sequences.fasta: Assembled Sanger sequences for 93 cloned NLRs.
Tetep_Illumina.mapped.to.Nipponbare.snpEff.vcf.gz: Variants with predicted effects, obtained by mapping Tetep Illumina reads to the Nipponbare genome. The variants were called with GATK HaplotypeCaller (https://software.broadinstitute.org/gatk/), and the effects were predicted with SnpEff (http://snpeff.sourceforge.net/) based on the MSU RGAP7 gene models (http://rice.plantbiology.msu.edu/). The file was compressed with bgzip (http://www.htslib.org/doc/bgzip.html).
Tetep_Illumina.mapped.to.Nipponbare.snpEff.vcf.gz.tbi: Tabix (http://www.htslib.org/doc/tabix.html) index file for "Tetep_Illumina.mapped.to.Nipponbare.snpEff.vcf.gz".
Tetep_Illumina.mapped.to.Nipponbare.snpEff.genes.txt.gz: Predicted effects for each gene in txt format.
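
Because bgzip output is gzip-compatible, the annotated VCF can be streamed with Python's `gzip` module. The sketch below tallies predicted effect types across all variants. Which INFO key holds the annotation depends on the SnpEff version (`ANN` in newer releases, `EFF` in older ones); the key used in this file is not stated here, so the code handles both. The function names are illustrative.

```python
import gzip
from collections import Counter


def parse_info(info_field):
    """Turn a VCF INFO string such as 'DP=10;ANN=...' into a dict."""
    info = {}
    for item in info_field.split(";"):
        key, _, value = item.partition("=")
        info[key] = value
    return info


def effects_of(info):
    """Return the effect names found in a SnpEff ANN or EFF annotation."""
    if "ANN" in info:
        # ANN entries are comma-separated; the effect is the second "|" field.
        return [a.split("|")[1] for a in info["ANN"].split(",") if "|" in a]
    if "EFF" in info:
        # EFF entries look like 'effect(details...)'; keep the part before "(".
        return [e.split("(", 1)[0] for e in info["EFF"].split(",")]
    return []


def count_effects(vcf_path):
    """Count predicted effect types over all records of a gzip/bgzip VCF."""
    counts = Counter()
    with gzip.open(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 8:
                continue  # not a complete VCF record
            counts.update(effects_of(parse_info(fields[7])))
    return counts
```

Calling `count_effects("Tetep_Illumina.mapped.to.Nipponbare.snpEff.vcf.gz")` scans the whole file sequentially; for region queries, the accompanying tabix index is the better route.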
OrthoFinder.Orthogroups.csv: Tab-separated text file containing the orthogroups identified with OrthoFinder (https://github.com/davidemms/OrthoFinder) among all predicted NB-ARC type genes in Tetep, Nipponbare, MH63, R498 and Brachypodium distachyon.
OrthoFinder.Orthologues_Tetep.tar.gz: Orthologues between Tetep and each of the other 4 genomes, as reported by OrthoFinder.
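
Despite its `.csv` extension, the orthogroup table is tab-separated, so it needs an explicit delimiter when parsed. The sketch below assumes the usual OrthoFinder layout: a header row with one column per genome, the orthogroup ID in the first column, and gene IDs within a cell separated by commas. That layout is an assumption about this particular file, and the function names are illustrative.

```python
import csv


def read_orthogroups(path):
    """Read a tab-separated orthogroup table into {orthogroup: {genome: [genes]}}."""
    orthogroups = {}
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, None)
        if header is None:
            return orthogroups
        genomes = header[1:]
        for row in reader:
            if not row:
                continue
            cells = row[1:]
            members = {}
            for index, genome in enumerate(genomes):
                # Rows may be shorter than the header if trailing cells are empty.
                cell = cells[index] if index < len(cells) else ""
                members[genome] = [g.strip() for g in cell.split(",") if g.strip()]
            orthogroups[row[0]] = members
    return orthogroups


def shared_by_all(orthogroups):
    """Return IDs of orthogroups with at least one gene in every genome."""
    return [
        group_id
        for group_id, members in orthogroups.items()
        if members and all(members.values())
    ]
```

For instance, `shared_by_all(read_orthogroups("OrthoFinder.Orthogroups.csv"))` lists the orthogroups represented in all five genomes.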