Simulated haplotype phasing by correlation of unique sequences within barcode-defined groups. James A. Stapleton, Jeongwoon Kim, John P. Hamilton, Ming Wu, Luiz C. Irber, Rohan Maddamsetti, Bryan Briney, Linsey Newton, Dennis R. Burton, C. Titus Brown, Christina Chan, C. Robin Buell, Timothy A. Whitehead. 2016 <p>Short unique sequences were identified at each end of each of the two <i>env</i> variants (Env1_1 and Env1_2 from variant 1, Env2_1 and Env2_2 from variant 2). Each barcode-defined group of short reads was searched for the four sequences. Within a group, a high count of the unique sequence from near the 5’ end of one variant (Env1_1 or Env2_1) is a strong predictor of a high count of the unique sequence from the 3’ end of the same variant (Env1_2 or Env2_2, respectively), and of a low count of the unique sequence from the 3’ end of the other variant. The haplotype across these two loci in a given barcoded individual can therefore be phased regardless of the length or identity of the intervening sequence.</p>
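<p>The counting and phasing logic described in the caption can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the marker sequences are hypothetical placeholders (the actual unique sequences are not given here), the function names and the <code>min_count</code> threshold are assumptions, and only forward-strand exact matches are counted.</p>

```python
from collections import Counter

# Hypothetical placeholder sequences; the real unique sequences are not
# given in the caption and would have to be substituted here.
MARKERS = {
    "Env1_1": "ACGTTGCA",  # 5' end of variant 1
    "Env1_2": "GGATCCTA",  # 3' end of variant 1
    "Env2_1": "TTGACCGA",  # 5' end of variant 2
    "Env2_2": "CATGGTCA",  # 3' end of variant 2
}


def count_markers(reads, markers=MARKERS):
    """Count exact forward-strand occurrences of each marker in one
    barcode-defined group of short reads."""
    counts = Counter({name: 0 for name in markers})
    for read in reads:
        for name, seq in markers.items():
            counts[name] += read.count(seq)
    return counts


def call_allele(count_v1, count_v2, min_count=2):
    """Return 'Env1' or 'Env2' for one locus, or None if the evidence is
    too weak (below min_count, an assumed threshold) or tied."""
    if max(count_v1, count_v2) < min_count or count_v1 == count_v2:
        return None
    return "Env1" if count_v1 > count_v2 else "Env2"


def phase_group(counts, min_count=2):
    """Return the (5' call, 3' call) pair for one barcode group.
    Matching calls, e.g. ('Env1', 'Env1'), link the two loci."""
    five_prime = call_allele(counts["Env1_1"], counts["Env2_1"], min_count)
    three_prime = call_allele(counts["Env1_2"], counts["Env2_2"], min_count)
    return five_prime, three_prime
```

<p>For a group whose reads contain mostly Env1_1 and Env1_2, <code>phase_group(count_markers(reads))</code> would yield matching calls for variant 1 at both loci; a group with too few marker hits yields <code>None</code> at the affected locus rather than a guess.</p>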