IMPROVED TOMATO GENOME REFERENCE USING FULL-LENGTH BACs, BIONANO GENOME MAPS AND SGN COMMUNITY RESOURCES
The Solanum lycopersicum cultivar Heinz 1706 genome is the primary reference for many solanaceous species. The previous genome build, SL2.50, contained 23,640 contig gaps and 79 scaffold gaps whose sizes were approximations. Gap regions accounted for 7.23% to 14.61% of each chromosome and 10.36% of the sequence across all chromosomes.
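As an illustration of how per-chromosome and overall gap percentages of this kind can be tallied, here is a minimal sketch. It assumes gaps are represented as runs of N in the chromosome sequences, and the function names and toy sequences are hypothetical; it is not the procedure used to produce the figures quoted above.

```python
import re

def gap_stats(seq):
    """Return (number of gap runs, gap bases, gap percentage) for one sequence.

    Assumption: a gap is any maximal run of 'N' or 'n' characters.
    """
    runs = [m.end() - m.start() for m in re.finditer(r"[Nn]+", seq)]
    gap_bases = sum(runs)
    pct = 100.0 * gap_bases / len(seq) if seq else 0.0
    return len(runs), gap_bases, pct

def genome_gap_stats(chromosomes):
    """Return per-chromosome stats and the overall gap percentage.

    `chromosomes` maps a chromosome name to its sequence string.
    """
    per_chrom = {name: gap_stats(seq) for name, seq in chromosomes.items()}
    total_len = sum(len(seq) for seq in chromosomes.values())
    total_gap = sum(gap_bases for _, gap_bases, _ in per_chrom.values())
    overall = 100.0 * total_gap / total_len if total_len else 0.0
    return per_chrom, overall

# Toy input for illustration only, not real tomato sequence.
toy = {"chr1": "ACGTNNNNAC", "chr2": "NNACGTACGTACGTACGTNN"}
per_chrom, overall = genome_gap_stats(toy)
```

Note that the overall percentage is computed from total gap bases over total length, so it is weighted by chromosome size rather than being a simple mean of the per-chromosome values.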
We have integrated 1,087 full-length HTGS phase 3 BACs into the tomato genome to cover gap regions and replace shorter whole-genome shotgun contigs, which removed 8,620,963 bases (8.6 Mb) of contig gaps. The reduction in contig gaps varied from 4.82% to 63.71% per chromosome. BioNano genome maps generated for Heinz 1706 largely confirmed the correctness of the current build. Chromosome 0 contains scaffolds that could not be localized in the genome build. In the new build, we were able to integrate 2 additional scaffolds from chr 0 into chrs 2 and 9, fix 2 inversions in chr 12 and 1 inversion in chr 3, and accurately resize 23 gaps using CMaps from the BioNano assembly.
Tomato-optimized annotation pipelines were run using RNA-seq data kindly provided by members of the Solanaceae community. Gene identifiers will be transferred from ITAG2.4 to the corresponding ITAG3.0 models to maintain backward compatibility. In cases where a gene was modified, its version number will be updated to reflect the change. Corrections submitted by the SGN user community for ITAG2.4 gene models and build SL2.50 will also be incorporated into ITAG3.0 and SL3.0.
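The identifier transfer rule described above can be sketched as follows. This is a minimal illustration that assumes gene identifiers of the form Solyc01g005000.2, where the suffix after the dot is the version number; the pattern, the function name transfer_id and the example identifier are assumptions made for illustration, not the actual ITAG3.0 pipeline.

```python
import re

# Assumed identifier layout: "Solyc" + 2-digit chromosome + "g" + 6 digits + "." + version.
ID_PATTERN = re.compile(r"^(Solyc\d{2}g\d{6})\.(\d+)$")

def transfer_id(old_id, modified):
    """Carry an identifier over to the new annotation.

    The stable part of the identifier is kept unchanged; the version
    suffix is incremented only when the gene model was modified.
    """
    match = ID_PATTERN.match(old_id)
    if match is None:
        raise ValueError(f"unrecognized gene identifier: {old_id!r}")
    base, version = match.group(1), int(match.group(2))
    if modified:
        version += 1
    return f"{base}.{version}"

# Hypothetical identifier used only to show the two cases.
unchanged = transfer_id("Solyc01g005000.2", modified=False)
updated = transfer_id("Solyc01g005000.2", modified=True)
```

Keeping the stable part fixed means that lookups keyed on the unversioned identifier continue to work across annotation releases, while the suffix signals that a model has changed.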