Chromosome-length genome assembly and structural variations of the primal Basenji dog (Canis lupus familiaris) genome
journal contribution
posted on 2021-07-01, 01:37 authored by RJ Edwards, MA Field, JM Ferguson, O Dudchenko, J Keilwagen, BD Rosen, GS Johnson, ES Rice, LD Hillier, JM Hammond, SG Towarnicki, A Omer, R Khan, K Skvortsova, O Bogdanovic, RA Zammit, EL Aiden, WC Warren, John Ballard
Background: Basenjis are considered an ancient dog breed of central African origins that still live and hunt with tribesmen in the African Congo. Nicknamed the barkless dog, Basenjis possess unique phylogeny, geographical origins and traits, making their genome structure of great interest. The increasing number of available canid reference genomes allows us to examine the impact of the choice of reference genome with regard to reference genome quality and breed relatedness.
Results: Here, we report two high-quality de novo Basenji genome assemblies: a female, China (CanFam_Bas), and a male, Wags. We conduct pairwise comparisons and report structural variations between assembled genomes of three dog breeds: Basenji (CanFam_Bas), Boxer (CanFam3.1) and German Shepherd Dog (GSD) (CanFam_GSD). CanFam_Bas is superior to CanFam3.1 in terms of genome contiguity and comparable overall to the high-quality CanFam_GSD assembly. By aligning short read data from 58 representative dog breeds to three reference genomes, we demonstrate how the choice of reference genome significantly impacts both read mapping and variant detection.
Conclusions: The growing number of high-quality canid reference genomes means the choice of reference genome is an increasingly critical decision in subsequent canid variant analyses. The basal position of the Basenji makes it suitable for variant analysis for targeted applications of specific dog breeds. However, we believe more comprehensive analyses across the entire family of canids are better suited to a pangenome approach. Collectively, this work highlights the importance of the choice of reference genome in all variation studies.
Funding
This work was supported by the University of New South Wales/School of Biotechnology and Biomolecular Sciences Genomics Initiative and the Basenji Health Endowment Inc., Poynette, WI. The DNA Zoo initiative funded the Hi-C data collection and analyses. RJE is funded by ARC LP160100610 and ARC LP180100721. MF is funded by NHMRC APP5121190. ELA was supported by an NSF Physics Frontiers Center Award (PHY1427654), the Welch Foundation (Q-1866), a USDA Agriculture and Food Research Initiative Grant (2017-05741), and an NIH Encyclopedia of DNA Elements Mapping Center Award (UM1HG009375).
History
Publication Date
2021-12-01
Journal
BMC Genomics
Volume
22
Issue
1
Article Number
ARTN 188
Pagination
19p.
Publisher
BMC
ISSN
1471-2164
Rights Statement
The Author reserves all moral rights over the deposited text and must be credited if any re-use occurs. Documents deposited in OPAL are the Open Access versions of outputs published elsewhere. Changes resulting from the publishing process may therefore not be reflected in this document. The final published version may be obtained via the publisher’s DOI. Please note that additional copyright and access restrictions may apply to the published version.