
4conv2pool4norm recall for regulatory sequence prediction for different cell lines.

journal contribution
posted on 01.12.2020, 18:30 by Louisa-Marie Krützfeldt, Max Schubach, Martin Kircher

Ten CNN models of the 4conv2pool4norm architecture were each trained on DHS datasets (positive) and corresponding negative sets of k-mer shuffled sequences (k = 2, k = 7) or genomic background sequences (tGC = 0.02) for A549 or MCF-7 cells. The A549 and MCF-7 cell lines are each represented in our data by two training datasets, labeled A and B. Model performance was evaluated based on recall on hold-out sets (chromosome 8). The table summarizes the mean and standard deviation of recall across the ten trained models. Seven hold-out sets derived from different cell lines are used to assess model generalization across cell types. Datasets are named according to S1 Table. The corresponding results for the gkm-SVM models are available in Table 1; results for CNN models of the 2conv2norm architecture are available in S5 Table.
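As an illustration of how such a summary can be computed, the following minimal sketch calculates recall per model on one hold-out set and then the mean and standard deviation across models. It is not the authors' code: the function names, the toy labels, and the use of the sample standard deviation (`statistics.stdev`) are assumptions, and the predictions are taken to be already binarized (1 = predicted regulatory sequence).

```python
from statistics import mean, stdev


def recall(y_true, y_pred):
    """Recall = TP / (TP + FN) for binary labels (1 = positive, e.g. DHS)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("hold-out set contains no positive examples")
    return tp / (tp + fn)


def summarize_recall(y_true, predictions_per_model):
    """Return (mean, sample standard deviation) of recall across models.

    Assumption: sample standard deviation; the source does not state
    whether the sample or population form was used.
    """
    recalls = [recall(y_true, preds) for preds in predictions_per_model]
    return mean(recalls), stdev(recalls)


# Hypothetical toy data: six hold-out sequences, two trained models.
hold_out_labels = [1, 1, 1, 1, 0, 0]
model_predictions = [
    [1, 1, 1, 1, 0, 0],  # model 1 recovers all four positives
    [1, 1, 0, 0, 0, 1],  # model 2 recovers two of four positives
]

mean_recall, sd_recall = summarize_recall(hold_out_labels, model_predictions)
print(f"recall: {mean_recall:.3f} +/- {sd_recall:.3f}")
```

In the setting described above, `predictions_per_model` would hold the outputs of the ten trained models for a single hold-out set, and the calculation would be repeated for each of the seven hold-out sets.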