Variation in genome assembly quality as measured by two metrics

posted on 01.05.2014, 15:35 by Keith Bradnam

Results from 44 genome assemblies produced by 7 different assemblers measured by 2 metrics:

x-axis: N50 length in bp of assembly scaffolds (log scale)

y-axis: Number of the 248 Core Eukaryotic Genes (CEGs) present in each assembly, as detected by the CEGMA program.
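
For readers unfamiliar with the x-axis metric, the sketch below shows one common way to compute N50 from a list of scaffold lengths: sort the scaffolds from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the example lengths are illustrative assumptions, not taken from the assemblies in this figure, and the original values may have been calculated with a different tool.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    N50 is the length L such that scaffolds of length >= L together
    contain at least half of the total bases in the assembly.
    """
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("need at least one scaffold with non-zero length")
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths, for illustration only.
print(n50([80, 70, 50, 40, 30, 20]))  # prints 70
```

In this toy example the total is 290 bp, and the two longest scaffolds (80 + 70 = 150 bp) already cover at least half of it, so the N50 is 70.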

The legend names the principal genome assembly tool used (other software may have been used to generate/edit the assembly).

Assemblies represent various eukaryotic species, all of which were submitted to the Korf Lab to have CEGMA run against them.

The main finding from this figure is that there is no obvious relationship between N50 length and the number of core genes present, and that no one assembler shows a consistent advantage over the others.
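
One way a reader could check the "no obvious relationship" observation numerically is with a rank correlation between the two metrics. The sketch below is a minimal, dependency-free Spearman correlation. It is not the analysis behind this figure, and the function names `average_ranks` and `spearman` are assumptions introduced here. Because it works on ranks, the log scaling of the N50 axis does not affect the result.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Calling `spearman(n50_values, ceg_counts)` with one entry per assembly would return a value between -1 and 1. Values near 0 would be consistent with the absence of a monotonic relationship, although with only 44 assemblies any such estimate should be read cautiously.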