1162304_Edwards,R_2021.pdf (2.29 MB)

Chromosome-length genome assembly and structural variations of the primal Basenji dog (Canis lupus familiaris) genome

journal contribution
posted on 2021-07-01, 01:37 authored by RJ Edwards, MA Field, JM Ferguson, O Dudchenko, J Keilwagen, BD Rosen, GS Johnson, ES Rice, LD Hillier, JM Hammond, SG Towarnicki, A Omer, R Khan, K Skvortsova, O Bogdanovic, RA Zammit, EL Aiden, WC Warren, John Ballard
Background: Basenjis are considered an ancient dog breed of central African origins that still live and hunt with tribesmen in the African Congo. Nicknamed the barkless dog, Basenjis possess unique phylogeny, geographical origins and traits, making their genome structure of great interest. The increasing number of available canid reference genomes allows us to examine the impact of the choice of reference genome with regard to reference genome quality and breed relatedness.

Results: Here, we report two high-quality de novo Basenji genome assemblies: a female, China (CanFam_Bas), and a male, Wags. We conduct pairwise comparisons and report structural variations between assembled genomes of three dog breeds: Basenji (CanFam_Bas), Boxer (CanFam3.1) and German Shepherd Dog (GSD) (CanFam_GSD). CanFam_Bas is superior to CanFam3.1 in terms of genome contiguity and comparable overall to the high-quality CanFam_GSD assembly. By aligning short read data from 58 representative dog breeds to three reference genomes, we demonstrate how the choice of reference genome significantly impacts both read mapping and variant detection.

Conclusions: The growing number of high-quality canid reference genomes means the choice of reference genome is an increasingly critical decision in subsequent canid variant analyses. The basal position of the Basenji makes it suitable for variant analysis for targeted applications of specific dog breeds. However, we believe more comprehensive analyses across the entire family of canids are better suited to a pangenome approach. Collectively, this work highlights the importance of the choice of reference genome in all variation studies.

Funding

This work was supported by the University of New South Wales/School of Biotechnology and Biomolecular Sciences Genomics Initiative and the Basenji Health Endowment Inc., Poynette, WI. The DNA Zoo initiative funded the Hi-C data collection and analyses. RJE is funded by ARC LP160100610 and ARC LP180100721. MF is funded by NHMRC APP5121190. ELA was supported by an NSF Physics Frontiers Center Award (PHY1427654), the Welch Foundation (Q-1866), a USDA Agriculture and Food Research Initiative Grant (2017-05741), and an NIH Encyclopedia of DNA Elements Mapping Center Award (UM1HG009375).

History

Publication Date

2021-12-01

Journal

BMC Genomics

Volume

22

Issue

1

Article Number

ARTN 188

Pagination

19p.

Publisher

BMC

ISSN

1471-2164

Rights Statement

The Author reserves all moral rights over the deposited text and must be credited if any re-use occurs. Documents deposited in OPAL are the Open Access versions of outputs published elsewhere. Changes resulting from the publishing process may therefore not be reflected in this document. The final published version may be obtained via the publisher’s DOI. Please note that additional copyright and access restrictions may apply to the published version.