10.1371/journal.pone.0147229.g003
James A. Stapleton, Jeongwoon Kim, John P. Hamilton, Ming Wu, Luiz C. Irber, Rohan Maddamsetti, Bryan Briney, Linsey Newton, Dennis R. Burton, C. Titus Brown, Christina Chan, C. Robin Buell, Timothy A. Whitehead
Individual assembly of full-length <i>env</i> genes from a mixture of two variants.
Public Library of Science 2016
cancer cell lines Long Reads acid molecules approach method mixture dna HIV env gene variants mRNA sequences animal genomic samples 11.6 kilobases length sequencing libraries custom equipment
2016-01-28 12:38:22
Figure
https://plos.figshare.com/articles/figure/_Individual_assembly_of_full_length_env_genes_from_a_mixture_of_two_variants_/1639542
<p>(a) The length distribution of the synthetic long reads (minimum length 1 kb) shows assembly of full-length 3-kb <i>env</i> gene sequences. (b) 1,173 synthetic reads between 1.5 and 3.2 kb in length were aligned to each of the two original <i>env</i> sequences (<i>env1</i> and <i>env2</i>). The alignment match rates are shown as a heatmap, with each synthetic read represented by a thin horizontal line. The majority of the synthetic reads align with low error to exactly one of the two original sequences, indicating high accuracy and a low rate of chimera formation. Chimeric reads would be expected to match both original sequences at intermediate accuracies. (c) Scatter plot showing the mismatch rates of each synthetic read against the two known <i>env</i> sequences. Synthetic reads (open circles to emphasize extensive overlap) cluster into two distinct groups along the axes (near-zero mismatch rate). Even the sixteen reads that do not fall on the clusters are distant from three manually created mock chimeras (crosses), indicating a low frequency of chimera formation.</p>
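The reasoning behind panels (b) and (c) can be illustrated with a small sketch: score each read against both references, then check whether it matches exactly one of them closely. This is not the authors' pipeline. The caption does not name the aligner or any cutoff, so the sketch assumes that a normalized edit distance can stand in for the alignment mismatch rate. The sequences are synthetic toy data, and `mismatch_rate`, `classify`, and the `max_rate` threshold are hypothetical names and values chosen for illustration.

```python
import random


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming, keeping two rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def mismatch_rate(read: str, ref: str) -> float:
    """Edit distance normalized by the longer sequence (assumed proxy)."""
    return edit_distance(read, ref) / max(len(read), len(ref))


def classify(read: str, ref1: str, ref2: str, max_rate: float = 0.02) -> str:
    """Assign a read to the one reference it matches closely, if any.

    max_rate is a hypothetical cutoff, not a value from the source.
    """
    r1 = mismatch_rate(read, ref1)
    r2 = mismatch_rate(read, ref2)
    if r1 <= max_rate < r2:
        return "env1"
    if r2 <= max_rate < r1:
        return "env2"
    return "unassigned"  # e.g. a chimera at intermediate distance from both


def substitute(seq: str, positions) -> str:
    """Replace the base at each position with the next base in ACGT order."""
    bases = "ACGT"
    out = list(seq)
    for p in positions:
        out[p] = bases[(bases.index(out[p]) + 1) % 4]
    return "".join(out)


# Toy references: env2 differs from env1 at every tenth position.
rng = random.Random(0)
env1 = "".join(rng.choice("ACGT") for _ in range(300))
env2 = substitute(env1, range(0, 300, 10))

# A read of env1 with a single error, and a mock chimera joined at the midpoint.
read1 = substitute(env1, [5])
chimera = env1[:150] + env2[150:]

for name, seq in [("read1", read1), ("chimera", chimera)]:
    print(name, mismatch_rate(seq, env1), mismatch_rate(seq, env2),
          classify(seq, env1, env2))
```

In this toy setup the mock chimera differs from each reference over only half of its length, so it sits between the two clusters rather than on either axis. That is the pattern the caption describes for the crosses in panel (c).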