figshare
Table_4.xls (5.5 kB)

Low-memory digital normalization.

dataset
posted on 2014-07-25, 03:13, authored by Qingpeng Zhang, Jason Pell, Rosangela Canino-Koning, Adina Chuang Howe, C. Titus Brown

The results of digitally normalizing a 5-million-read E. coli data set (1.4 GB) to C = 20 with k = 20 under several memory usage/false positive rates. The false positive rate (column 1) is empirically determined. We measured the reads remaining, the number of “true” k-mers missing from the data at each step, and the total number of k-mers remaining. Note: at high false positive rates, reads are erroneously removed due to inflation of k-mer counts.
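
The effect described in the note can be illustrated with a minimal sketch of digital normalization over a fixed-memory probabilistic counter. This is an illustration under stated assumptions, not the code used to produce the table: the class `CountMinSketch`, the functions `kmers` and `normalize`, the hashing scheme, the median-based keep rule, and the table sizes are all hypothetical choices made for the example. A read is kept only if the median count of its k-mers is below the cutoff C; because hash collisions can only raise a reported count, a smaller table (a higher false positive rate) pushes medians over the cutoff and discards reads that should have been kept.

```python
import hashlib
from statistics import median


class CountMinSketch:
    """Approximate k-mer counter; reported counts are never below the true count."""

    def __init__(self, table_sizes):
        # One counter array per table; the sizes here are illustrative assumptions.
        self.tables = [[0] * size for size in table_sizes]

    def _slots(self, kmer):
        # Stable 64-bit hash, reduced modulo each table size.
        digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        return [h % len(table) for table in self.tables]

    def add(self, kmer):
        for table, slot in zip(self.tables, self._slots(kmer)):
            table[slot] += 1

    def count(self, kmer):
        # Taking the minimum across tables limits, but cannot remove, inflation.
        return min(table[slot] for table, slot in zip(self.tables, self._slots(kmer)))


def kmers(read, k):
    """Return all overlapping substrings of length k."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def normalize(reads, k, cutoff, sketch):
    """Keep a read only if the median count of its k-mers is below the cutoff."""
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k
        if median(sketch.count(x) for x in read_kmers) < cutoff:
            kept.append(read)
            for x in read_kmers:
                sketch.add(x)
    return kept
```

For example, `normalize(["AAAAAAAA", "CCCCCCCC"], 4, 3, CountMinSketch([1]))` forces every k-mer into a single counter, so the second read sees an inflated median of 5 and is dropped even though none of its k-mers has been seen before.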
