File(s) stored somewhere else

Please note: Linked content is NOT stored on Ag Data Commons and we can't guarantee its availability, quality, security or accept any liability.

Genome Sequences and Sequence Data of African Swine Fever virus from the Dominican Republic 2021

dataset
posted on 2024-06-11, 07:15 authored by USDA/APHIS/FADDL
Diagnostic samples from swine in the Dominican Republic were provided to the USDA Foreign Animal Disease Diagnostic Laboratory (FADDL) for testing. Initial tests indicated the presence of African Swine Fever virus, and the samples were subsequently submitted for whole genome sequencing for confirmation and characterization. The samples were blood and tissue obtained from affected swine from May through early August 2021. They were sequenced on the Oxford Nanopore PromethION and the Illumina MiSeq platforms.

The best reference from the NCBI RefSeq database was determined by aligning reads against all references and selecting the reference genome with the closest identity and breadth of coverage. The closest reference genome by identity was Georgia 2007/1 (NC_044959.2). Reference-guided alignment was performed using BWA-MEM (version 0.7.17-r1188) for Illumina data and Minimap2 (version 2.21-r1071) for Oxford Nanopore data. Variants were called against the Georgia 2007/1 reference using Freebayes (version 1.3.4) across the population of samples, then filtered on quality score, depth, and log-odds ratio to remove low-confidence calls. This produced a sample set with 21 total variant sites, all of which were single nucleotide polymorphisms.

These variants were applied to the Georgia 2007/1 genome to produce the final consensus sequences. Annotations were transferred from the Georgia 2007/1 genome, and gene coding sequences affected by the variants were modified accordingly. The BioSamples in this BioProject include raw data files and genome consensus sequences built against Georgia 2007/1 by reference-guided assembly. These genomes have not been finished and may contain additional structural variations that were not detected by the sequencing performed here.
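
The filtering and consensus steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used by FADDL: the `Snp` record, the `filter_snps` and `apply_snps` helpers, the thresholds (`min_qual`, `min_depth`), and the toy reference string are all hypothetical, and the log-odds filter mentioned in the description is omitted for brevity. The sketch only shows the general idea of dropping low-confidence SNPs and substituting the remaining ones into a reference sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    """A single nucleotide polymorphism (hypothetical record layout)."""
    pos: int      # 1-based position on the reference, as in VCF
    ref: str      # reference base
    alt: str      # alternate base
    qual: float   # variant quality score
    depth: int    # read depth at the site


def filter_snps(snps, min_qual=30.0, min_depth=10):
    """Keep SNPs that pass the (assumed) quality and depth thresholds."""
    return [s for s in snps if s.qual >= min_qual and s.depth >= min_depth]


def apply_snps(reference, snps):
    """Return a consensus made by substituting each SNP into the reference."""
    seq = list(reference)
    for s in snps:
        if len(s.ref) != 1 or len(s.alt) != 1:
            raise ValueError(f"not a SNP at position {s.pos}")
        idx = s.pos - 1  # convert 1-based position to 0-based index
        if seq[idx] != s.ref:
            raise ValueError(f"reference mismatch at position {s.pos}")
        seq[idx] = s.alt
    return "".join(seq)


# Toy data only; these are not real ASFV coordinates or calls.
reference = "ACGTACGTAC"
snps = [
    Snp(pos=2, ref="C", alt="T", qual=99.0, depth=50),  # passes both filters
    Snp(pos=5, ref="A", alt="G", qual=12.0, depth=40),  # fails quality
    Snp(pos=9, ref="A", alt="C", qual=80.0, depth=3),   # fails depth
]

kept = filter_snps(snps)
consensus = apply_snps(reference, kept)
print(consensus)
```

Because only substitutions are applied, the consensus keeps the reference coordinates, which is what allows annotations to be transferred directly. It also means insertions, deletions, and larger structural changes are outside the scope of this kind of sketch.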

History

Data contact name

BioProject Curation Staff

Publisher

National Center for Biotechnology Information

Temporal Extent Start Date

2021-10-04

Theme

  • Non-geospatial

ISO Topic Category

  • biota

National Agricultural Library Thesaurus terms

sequence analysis

Pending citation

  • No

Public Access Level

  • Public

Accession Number

PRJNA768333

Preferred dataset citation

It is recommended to cite the accession numbers that are assigned to data submissions, e.g. the GenBank, WGS or SRA accession numbers. If individual BioProjects need to be referenced, state that "The data have been deposited with links to BioProject accession number PRJNA768333 in the NCBI BioProject database (https://www.ncbi.nlm.nih.gov/bioproject/)."
