Mitigating the effects of reference sequence bias in single-multiplex massively parallel sequencing of the mitochondrial DNA control region.

journal contribution
posted on 2019-09-16, 13:37 authored by Tunde I. Huszar, Jon H. Wetton, Mark A. Jobling
Sequence analysis of the mitochondrial DNA (mtDNA) control region can provide forensically useful information, particularly in challenging samples where autosomal DNA profiling fails. Sub-division of the 1122-bp region into shorter PCR fragments improves data recovery, and such fragments can be analysed together via massively parallel sequencing (MPS). Here, we generate mtDNA data using the prototype PowerSeq™ Auto/Mito/Y System (Promega) MPS assay, in which a single PCR reaction amplifies ten overlapping amplicons of the control region, in a set of 101 highly diverse samples representing most major clades of the mtDNA phylogeny. The overlapping multiplex design leads to non-uniform coverage in the regions of overlap, where it is further increased by short amplicons generated alongside the intended products. Primer sequences in targeted amplification libraries are a potential source of reference sequence bias and thus should be removed, but the proprietary nature of the primers in commercial kits necessitates an alternative approach that minimises data loss: here, we introduce the bioinformatic selection of sequencing reads spanning putative primer sites (Overarching Read Enrichment Option, OREO). While OREO performs well in mitigating the effects of primer sequences at the ends of sequence reads, we still find evidence of the internalisation of primer-derived sequences by overlap extension, which may compromise the ability to call variants or to measure heteroplasmy in primer-binding regions. The commercially available PowerSeq™ CRM Nested System design prevents primer internalisation, as shown in a reanalysis of a subset of 57 samples that contain possible heteroplasmies. In combination with OREO, the CRM Nested kit mitigates reference sequence bias, allowing heteroplasmic variants to be estimated down to a 5% threshold. 
Provided appropriate steps are taken in data processing, single-reaction multiplex assays represent robust tools to analyse mtDNA control region variation. The OREO approach will allow users to bypass the effects of unknown primer sequences in any single-reaction tiled multiplex and eliminate primer-derived bias in overlapping amplicon sequencing studies, in both forensic and non-forensic settings.
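The read-selection idea behind OREO can be illustrated with a minimal sketch. The abstract states only that reads spanning putative primer sites are selected; everything else here is an assumption made for illustration, not the authors' implementation: the `AlignedRead` type, the half-open coordinate convention, the `margin` parameter, and the function names are all hypothetical. The sketch also treats the reference as linear and ignores the circularity of mtDNA.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Hypothetical representation of an aligned read: reference coordinates are
# 0-based and half-open, [start, end). This convention is an assumption.
@dataclass(frozen=True)
class AlignedRead:
    name: str
    start: int
    end: int

# A putative primer site is modelled as a half-open interval (start, end).
Site = Tuple[int, int]


def spans_site(read: AlignedRead, site: Site, margin: int = 0) -> bool:
    """Return True if the read covers the whole site plus `margin` bases on each side."""
    site_start, site_end = site
    return read.start <= site_start - margin and read.end >= site_end + margin


def select_overarching_reads(
    reads: Iterable[AlignedRead], primer_sites: List[Site], margin: int = 5
) -> List[AlignedRead]:
    """Keep reads that overarch at least one putative primer site.

    Reads that merely start or end inside a site are dropped, since their
    terminal bases may be primer-derived rather than template-derived.
    The default margin of 5 bases is an illustrative choice only.
    """
    return [
        read
        for read in reads
        if any(spans_site(read, site, margin) for site in primer_sites)
    ]


if __name__ == "__main__":
    example_reads = [
        AlignedRead("r1", 100, 300),  # extends well past both ends of the site
        AlignedRead("r2", 150, 260),  # starts inside the site
        AlignedRead("r3", 0, 120),    # ends before the site
    ]
    example_sites = [(140, 160)]
    for read in select_overarching_reads(example_reads, example_sites):
        print(read.name)
```

In this toy example only `r1` is retained, because it is the only read whose alignment extends beyond both ends of the site by at least the margin. In practice the coordinates would come from an alignment file and the putative sites would have to be inferred from the data, since the kit primers are proprietary.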

Funding

TIH was supported by a Biotechnology and Biological Sciences Research Council iCASE studentship, grant ref. BB/M016706/1, partnered by Key Forensic Services. We thank Promega, and Andy Hopwood & Nikki Peake in particular, for access to the PowerSeq™ Auto/Mito/Y System prototype kit, and the PowerSeq™ CRM Nested System kit and reagents. We gratefully acknowledge colleagues who contributed DNA samples, and NUCLEUS Genomic Services at the University of Leicester for training and access to Illumina MiSeq sequencing. This research used the SPECTRE High Performance Computing Facility at the University of Leicester for data analysis.

History

Citation

Forensic Science International: Genetics, 2019, 40, pp. 9-17

Author affiliation

/Organisation/COLLEGE OF LIFE SCIENCES/Biological Sciences/Genetics and Genome Biology

Version

  • VoR (Version of Record)

Published in

Forensic Science International: Genetics

Publisher

Elsevier for International Society for Forensic Genetics (ISFG)

eissn

1878-0326

Acceptance date

2019-01-15

Copyright date

2019

Available date

2019-09-16

Publisher version

https://www.sciencedirect.com/science/article/pii/S187249731830591X?via=ihub

Notes

Supplementary material related to this article can be found, in the online version, at https://doi.org/10.1016/j.fsigen.2019.01.008.

Language

en
