Computational Methods for Filtering Contaminants from NGS Data

Duncan Murdock, Janna Fierst

Next-generation sequencing (NGS) technologies have made it possible to perform many more, and more complex, genomic analyses. Many of these analyses depend on an existing or newly created reference genome, that is, an assembly of sequence data into a representation of the target organism’s genome. However, genome assembly tools cannot distinguish between sequence from the target organism and sequence from any contaminants that were sequenced at the same time. In addition, it is not feasible to physically isolate most target organisms from their parasites, symbionts, or other potential contaminants. Computational methods are therefore needed to filter contaminant sequences out of the data before assembly. Here, we use NGS data from PX356, an inbred line of Caenorhabditis remanei, to compare three methods for filtering DNA sequence data for assembly.


