36 files

Capture of hemoglobin clusters

Combining high-throughput sequencing with targeted sequence capture has become
an attractive tool to study specific genomic regions of interest. Most studies have so far
focused on the exome using short-read technology. These approaches are not designed
to capture intergenic regions needed to reconstruct genomic organization, including
regulatory regions and gene synteny. Here, we demonstrate the power of combining
targeted sequence capture with long-read sequencing technology for comparative
genomic analyses of the hemoglobin (Hb) gene clusters across eight species separated
by up to 70 million years. Guided by the reference genome assembly of the Atlantic cod
(Gadus morhua) together with genome information from draft assemblies of selected
codfishes, we designed probes covering the two Hb gene clusters. Use of custom-made
barcodes combined with PacBio RSII sequencing led to highly continuous assemblies of
the LA (~100 kb) and MN (~200 kb) clusters, which include syntenic regions of coding
and intergenic sequences. Our results revealed an overall conserved genetic
organization and synteny of the Hb genes within this lineage, yet with several
lineage-specific gene duplications. Moreover, for some of the species examined, we identified
amino acid substitutions at two sites in the Hbb1 gene as well as length polymorphisms
in its regulatory region, which have previously been linked to temperature adaptation in
Atlantic cod populations. This study highlights the use of targeted long-read capture as
a versatile approach for comparative genomic studies by generation of a cross-species
genomic resource elucidating the evolutionary history of the Hb gene family across the
highly divergent group of codfishes.


160111_Gmorhua_capture_probes.zip Sequences of the different probes.
Hb_target_region_gadmor2.fasta.gz The target regions from the gadMor2 genome assembly.
brosme_brosme.rawreads.fastq.gz The raw reads from capture for Brosme brosme.
brosme_brosme_hb_assembly.fasta.gz The assembled hemoglobin regions of Brosme brosme.
gadiculus_argenteus.rawreads.fastq.gz The raw reads from capture for Gadiculus argenteus.
gadiculus_argenteus_hb_assembly.fasta.gz The assembled hemoglobin regions of Gadiculus argenteus.
gadus_morhua.rawreads.fastq.gz The raw reads from capture for Gadus morhua.
gadus_morhua_hb_assembly.fasta.gz The assembled hemoglobin regions of Gadus morhua.
lota_lota.rawreads.fastq.gz The raw reads from capture for Lota lota.
lota_lota_hb_assembly.fasta.gz The assembled hemoglobin regions of Lota lota.
macrourus_berglax.rawreads.fastq.gz The raw reads from capture for Macrourus berglax.
macrourus_berglax_hb_assembly.fasta.gz The assembled hemoglobin regions of Macrourus berglax.
melanogrammus_aeglefinus.rawreads.fastq.gz The raw reads from capture for Melanogrammus aeglefinus.
melanogrammus_aeglefinus_hb_assembly.fasta.gz The assembled hemoglobin regions of Melanogrammus aeglefinus.
merluccius_merluccius.rawreads.fastq.gz The raw reads from capture for Merluccius merluccius.
merluccius_merluccius_hb_assembly.fasta.gz The assembled hemoglobin regions of Merluccius merluccius.
muraenolepsis_marmoratus.rawreads.fastq.gz The raw reads from capture for Muraenolepis marmoratus.
muraenolepsis_marmoratus_hb_assembly.fasta.gz The assembled hemoglobin regions of Muraenolepis marmoratus.
Boreogadus_saida_fish_2LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Boreogadus_saida_fish_2MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Gadiculus_argentus_fish_8LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Gadiculus_argentus_fish_8MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Lota_lota_fish_11LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Lota_lota_fish_11MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Macrourus_berglax_fish_17LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Macrourus_berglax_fish_17MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Melanogrammus_aeglefinus_fish_5LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Melanogrammus_aeglefinus_fish_5MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Merluccius_merluccius_fish_13LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Merluccius_merluccius_fish_13MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Muraenolepis_marmoratus_fish_20LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Muraenolepis_marmoratus_fish_20MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Theragra_chalcogramma_fish_7LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Theragra_chalcogramma_fish_7MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Trachyrincus_scabrus_fish_36LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Trachyrincus_scabrus_fish_36MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
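
The assemblies and scaffolds listed above are gzip-compressed FASTA files, so a quick first check after downloading is to list the records in a file and their lengths. The following is a minimal sketch using only the Python standard library. The helper name read_fasta_lengths is hypothetical, the file is assumed to sit in the current working directory, and nothing here is part of the authors' pipeline.

```python
import gzip
from pathlib import Path


def read_fasta_lengths(path):
    """Return {record_name: sequence_length} for a plain or gzip-compressed FASTA file."""
    # Pick the opener from the file extension; both are opened in text mode.
    opener = gzip.open if str(path).endswith(".gz") else open
    lengths = {}
    name = None
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Header line: use the first whitespace-delimited token as the name,
                # falling back to a positional label if the header is empty.
                parts = line[1:].split()
                name = parts[0] if parts else f"record_{len(lengths) + 1}"
                lengths[name] = 0
            elif name is not None:
                # Sequence line: add its length to the current record.
                lengths[name] += len(line)
    return lengths


if __name__ == "__main__":
    # Assumed local copy of one of the files listed above.
    fasta = Path("gadus_morhua_hb_assembly.fasta.gz")
    if fasta.exists():
        for record, length in read_fasta_lengths(fasta).items():
            print(f"{record}\t{length}")
```

The same function should work on the *_scf.fasta.gz scaffolds used for probe design. The raw read files are FASTQ rather than FASTA and would need a different parser.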
