Neem (Azadirachta indica A. Juss.), an evergreen tree of the
Meliaceae family, is known for its medicinal, cosmetic, pesticidal and
insecticidal properties.
We previously sequenced and published the draft genome of the plant,
using mainly short-read sequencing data. In this report, we present an
improved genome assembly generated using additional short reads from
Illumina and long reads from the Pacific Biosciences SMRT sequencer. We
assembled short reads and error-corrected long reads using Platanus, an
assembler designed to perform well for heterozygous genomes. The
updated genome assembly (v2.0) yielded 3- and 3.5-fold increases in N50
and N75, respectively; a 2.6-fold decrease in the total number of
scaffolds; a 1.25-fold increase in the number of valid transcriptome
alignments; a 13.4-fold reduction in mis-assemblies; and a 1.85-fold
increase in the percentage of repeats, relative to the earlier assembly
(v1.0).
The current assembly also maps better to the genes known to be involved
in the terpenoid biosynthesis pathway. Together, these data represent an
improved assembly of the A. indica genome. The raw data described in
this manuscript have been submitted to the NCBI Sequence Read Archive
under the accession numbers SRX1074131, SRX1074132, SRX1074133, and
SRX1074134 (SRP013453).
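
For readers unfamiliar with the contiguity statistics quoted above, the following is a minimal sketch of how N50 and N75 are commonly computed from a list of scaffold lengths, and how a fold change between two assemblies could be derived. The function names (nx, fold_change) and the toy lengths are hypothetical illustrations; they are not the authors' pipeline or the actual neem scaffold lengths.

```python
from typing import Iterable


def nx(lengths: Iterable[int], fraction: float) -> int:
    """Return the Nx statistic for a set of scaffold lengths.

    Nx is the length of the shortest scaffold in the smallest set of
    longest scaffolds that together cover at least `fraction` of the
    total assembly length (fraction=0.5 gives N50, 0.75 gives N75).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")

    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]  # only reachable through floating-point rounding


def fold_change(new_value: float, old_value: float) -> float:
    """Ratio of a metric in the new assembly to the old assembly."""
    if old_value == 0:
        raise ZeroDivisionError("old_value must be non-zero")
    return new_value / old_value


# Toy scaffold lengths (hypothetical, in base pairs); total length is 300.
toy_lengths = [100, 80, 60, 40, 20]
toy_n50 = nx(toy_lengths, 0.50)
toy_n75 = nx(toy_lengths, 0.75)
```

With the toy lengths, the cumulative sums of the sorted scaffolds are 100, 180, 240, 280 and 300, so the 50% threshold (150) is first reached at the 80 bp scaffold and the 75% threshold (225) at the 60 bp scaffold. Applying fold_change to the N50 values of two assemblies gives the kind of ratio reported in the abstract.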